Abstract

Weighted Max-SAT is the optimization version of SAT and many important problems can 
be naturally encoded as such. Solving weighted Max-SAT is an important problem from 
both a theoretical and a practical point of view. In recent years, there has been considerable 
interest in finding efficient solving techniques. Most of this work focuses on the computation
of good quality lower bounds to be used within a branch and bound DPLL-like algorithm.
Most often, these lower bounds are described in a procedural way. Because of that, it is
difficult to see the logic behind them.

In this paper we introduce an original framework for Max-SAT that stresses the
parallelism with classical SAT. Then, we extend the two basic SAT solving techniques: search
and inference. We show that many algorithmic tricks used in state-of-the-art Max-SAT
solvers are easily expressible in logic terms with our framework in a unified manner.

Besides, we introduce an original search algorithm that performs a restricted amount 
of weighted resolution at each visited node. We empirically compare our algorithm with a 
variety of solving alternatives on several benchmarks. Our experiments, which constitute, to
the best of our knowledge, the most comprehensive Max-SAT evaluation ever reported, show
that our algorithm is generally orders of magnitude faster than any competitor.
1 Introduction



Weighted Max-SAT is the optimization version of the SAT problem and many
important problems can be naturally expressed as such. They include academic
problems such as max cut or max clique, as well as real problems in domains like
routing [3], bioinformatics [4], scheduling [5], probabilistic reasoning [6] and electronic
markets [7]. In recent years, there has been a considerable effort in finding
efficient exact algorithms. These works can be divided into theoretical [8,9,10] and
empirical [11,12,13,14]. A common drawback of all these algorithms is that, despite
the close relationship between SAT and Max-SAT, they cannot be easily described
with logic terminology. For instance, the contributions of [11,12,13,14] are good
quality lower bounds to be incorporated into a depth-first branch and bound
procedure. These lower bounds are mostly defined in a procedural way and it is very
difficult to see the logic behind the execution of the procedure. This is in
contrast with SAT algorithms, where the solving process can be easily decomposed
into atomic logical steps.

In this paper we introduce an original framework for (weighted) Max-SAT in which 
the notions of upper and lower bound are incorporated into the problem definition. 
Under this framework classical SAT is just a particular case of Max-SAT, and the 
main SAT solving techniques can be naturally extended. In particular, we extend 
the basic simplification rules (for example, idempotency, absorption, unit clause
reduction, etc.) and introduce a new one, hardening, that does not make sense in the
SAT context. We also extend the two fundamental SAT algorithms: DPLL (based
on search) and DP (based on inference). We also show that the complexity of the
extension of DP is exponential in the formula's induced width (which is hardly
a surprise, since this is also the case of other inference algorithms for graphical
models [15,16]). Interestingly, our resolution rule includes, as special cases, many
techniques spread over the recent Max-SAT literature. One merit of our framework
is that it allows us to see all these techniques as inference rules that transform the
problem into an equivalent simpler one, as is customary in the SAT context.

The second contribution of this paper is more practical. We introduce an original 
search algorithm that incorporates three different forms of resolution at each visited 
node: neighborhood resolution, chain resolution and cycle resolution. Our experi- 
mental results on a variety of domains indicate that our algorithm is orders of mag- 
nitude faster than its competitors. This is especially true as the ratio between the 
number of clauses and the number of variables increases. Note that these are typi- 
cally the hardest instances for Max-SAT. Our experiments include random weighted 
and unweighted Max-SAT instances, random and structured Max-one problems, 
random Max-cut problems, random and structured Max-clique problems and com- 
binatorial auctions. 

Some of the ideas presented in this paper have strong connections to different
techniques recently developed in the WCSP field [17]. Especially significant is
the connection with local consistency [18,19,20,21,22] and variable elimination
[15,23,24].

The structure of the paper is as follows: In Section 2 we review SAT terminology. In 
Section 3 we present Max-SAT and introduce our framework. In Section 4 we extend
the essential solving techniques from SAT to Max-SAT. Section 5 summarizes
in a unified way special forms of resolution that can be used to simplify Max-SAT
formulas. Section 6 describes our solver. Section 7 reports our experimental work,
which corroborates the efficiency of our solver compared to other state-of-the-art
solving alternatives. Finally, Section 8 concludes and points out directions of
future work.



2 Preliminaries on SAT 

In the sequel $X = \{x_1, x_2, \ldots, x_n\}$ is a set of boolean variables. A literal is either a
variable $x_i$ or its negation $\bar{x}_i$. The variable to which literal $l$ refers is noted $var(l)$
(namely, $var(x_i) = var(\bar{x}_i) = x_i$). If variable $x_i$ is assigned to true, literal $x_i$ is satisfied
and literal $\bar{x}_i$ is falsified. Similarly, if variable $x_i$ is instantiated to false, literal $\bar{x}_i$ is
satisfied and literal $x_i$ is falsified. An assignment is complete if it gives values to
all the variables in $X$ (otherwise it is partial). A clause $C = l_1 \vee l_2 \vee \ldots \vee l_k$ is a
disjunction of literals such that $\forall_{1 \leq i,j \leq k,\ i \neq j}\ var(l_i) \neq var(l_j)$. It is customary to
think of a clause as a set of literals, which allows us to use the usual set operations. If
$x \in C$ (resp. $\bar{x} \in C$) we say that $x$ appears in the clause with positive (resp. negative)
sign. The size of a clause, noted $|C|$, is the number of literals that it has. $var(C)$
is the set of variables that appear in $C$ (namely, $var(C) = \{var(l) \mid l \in C\}$). An
assignment satisfies a clause iff it satisfies one or more of its literals. Consequently,
the empty clause, noted $\Box$, cannot be satisfied. Sometimes it is convenient to think
of clause $C$ as its equivalent $C \vee \Box$. A logical formula $\mathcal{F}$ in conjunctive normal
form (CNF) is a conjunction of different clauses, normally expressed as a set. A
satisfying complete assignment is called a model of the formula. Given a CNF
formula, the SAT problem consists in determining whether there is any model for
it or not. The empty formula, noted $\emptyset$, is trivially satisfiable. A formula containing
the empty clause is trivially unsatisfiable and we say that it contains an explicit
contradiction.



2.1 Graph concepts [25]

The structure of a CNF formula $\mathcal{F}$ can be described by its interaction graph $G(\mathcal{F})$
containing one vertex associated to each boolean variable. There is an edge for each
pair of vertices that correspond to variables appearing in the same clause. Given a
graph $G$ and an ordering of its vertices $d$, the parents of a node $x_i$ is the set of
vertices connected to $x_i$ that precede $x_i$ in the ordering. The width of $x_i$ along $d$ is
the number of parents that it has. The width of the graph along $d$, denoted $w_d$, is
the maximum width among the vertices.

Fig. 1. On the left, a graph $G$. On the right, the induced graph $G^*_d$, where $d$ is the
lexicographic order.

The induced graph of $G(\mathcal{F})$ along $d$, denoted $G^*_d(\mathcal{F})$, is obtained as follows: The
vertices of $G$ are processed from last to first along $d$. When processing vertex $x_i$,
we connect every pair of unconnected parents. The induced width of $G$ along $d$,
denoted $w^*_d$, is the width of the induced graph. The induced width (also known
as tree-width, k-tree number or the dimension of the graph) is a measure of how 
far a graph is from acyclicity and it is a fundamental structural parameter in the 
characterization of many combinatorial algorithms. Computing the ordering d that 
provides the minimum induced width is an NP-hard problem [26]. 

Example 1 Consider the formula $\mathcal{F} = \{x_1 \vee x_4, \bar{x}_1 \vee x_4, x_2 \vee x_3, x_2 \vee x_4, x_2 \vee x_5, x_4 \vee x_5\}$.
Its interaction graph $G(\mathcal{F})$ is depicted in Figure 1(a). The induced graph $G^*_d$
along the lexicographical order is depicted in Figure 1(b). The dotted edge is the only
new edge with respect to the original graph. When processing node $x_5$, no new edges
are added, because the parents of $x_5$ are already connected. When processing node
$x_4$, the edge connecting $x_2$ and $x_1$ is added because both variables are parents of
$x_4$ and they were not connected. When processing $x_3$, $x_2$ and $x_1$, no new edges are
added. The induced width $w^*_d$ is 2 because nodes $x_4$ and $x_5$ have width 2 (namely,
they have two parents) in the induced graph.
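The construction of the induced graph can be reproduced with a few lines of code. The following Python sketch is only an illustration (the function name `induced_graph` and the integer encoding of the vertices are our own assumptions). It processes the vertices from last to first, connects unconnected parents, and records the largest number of parents found.

```python
from itertools import combinations

def induced_graph(edges, order):
    """Return (induced_edges, induced_width) of a graph along `order`."""
    pos = {v: i for i, v in enumerate(order)}
    adj = {v: set() for v in order}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    width = 0
    # Process vertices from last to first along the ordering.
    for v in reversed(order):
        parents = [u for u in adj[v] if pos[u] < pos[v]]
        width = max(width, len(parents))
        # Connect every pair of parents (a no-op if already connected).
        for a, b in combinations(parents, 2):
            adj[a].add(b)
            adj[b].add(a)
    induced = {frozenset((a, b)) for a in adj for b in adj[a]}
    return induced, width

# Interaction graph of Example 1, with vertex i standing for variable x_i.
edges = [(1, 4), (2, 3), (2, 4), (2, 5), (4, 5)]
induced, width = induced_graph(edges, [1, 2, 3, 4, 5])
new_edges = induced - {frozenset(e) for e in edges}
```

On the graph of Example 1 the only edge added connects vertices 1 and 2, and the induced width is 2.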



2.2 SAT algorithms 



CNF formulas can be simplified using equivalences or reductions. Well known
equivalences are idempotency $C \wedge C \equiv C$, absorption $C \wedge (C \vee B) \equiv C$ or unit clause
reduction $l \wedge (\bar{l} \vee C) \equiv l \wedge C$. A well known reduction is the pure literal rule, which
says that if there is a variable that only occurs in either positive or negative
form, all clauses mentioning it can be discarded from the formula. Simplifications
can be applied until quiescence. The assignment of true (resp. false) to variable $x$
in $\mathcal{F}$ is noted $\mathcal{F}[x]$ (resp. $\mathcal{F}[\bar{x}]$) and produces a new formula in which all clauses
containing $x$ (resp. $\bar{x}$) are eliminated from the formula, and $\bar{x}$ (resp. $x$) is removed
from all clauses where it appears. Note that $\mathcal{F}[l]$ can be seen as the addition of $l$ to
the formula and the repeated application of unit clause reduction followed by the
pure literal rule.

function DPLL($\mathcal{F}$) return boolean
1. $\mathcal{F}$ := Simplify($\mathcal{F}$)
2. if $\mathcal{F} = \emptyset$ then return true
3. if $\mathcal{F} = \{\Box\}$ then return false
4. $l$ := SelectLiteral($\mathcal{F}$)
5. return DPLL($\mathcal{F}[l]$) $\vee$ DPLL($\mathcal{F}[\bar{l}]$)
endfunction

Fig. 2. DPLL is a search algorithm. It returns true iff $\mathcal{F}$ is satisfiable.

Algorithms for SAT can be roughly divided into search and inference. The most 
popular search algorithm and the starting point of most state-of-the-art SAT solvers 
was proposed in [27] and is usually called Davis Putnam Logemann Loveland 
(DPLL). Figure 2 provides a recursive description. First, DPLL simplifies its input 
(line 1). If the resulting formula is empty, it reports success (line 2). If the resulting 
formula is a contradiction, it reports failure (line 3). Else it selects a literal $l$ (line
4) and sequentially assigns the formula with $l$ and $\bar{l}$ (line 5).
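The recursive scheme of DPLL can be illustrated with a short Python sketch. It is not the procedure of any particular solver: literals are encoded as non-zero integers (a negative integer is a negated variable), `simplify` only performs unit clause reduction, and the names `cnf`, `assign`, `simplify` and `dpll` are our own assumptions.

```python
def cnf(clauses):
    """Build a formula (a set of frozensets of integer literals)."""
    return {frozenset(c) for c in clauses}

def assign(formula, lit):
    """Return F[lit]: drop clauses containing lit, remove -lit elsewhere."""
    return {c - {-lit} for c in formula if lit not in c}

def simplify(formula):
    """Apply unit clause reduction until quiescence."""
    formula = set(formula)
    while True:
        if frozenset() in formula:
            return {frozenset()}      # explicit contradiction
        unit = next((c for c in formula if len(c) == 1), None)
        if unit is None:
            return formula
        (lit,) = unit
        formula = assign(formula, lit)

def dpll(formula):
    """Return True iff the formula is satisfiable."""
    formula = simplify(formula)
    if not formula:
        return True                   # empty formula
    if frozenset() in formula:
        return False                  # contains the empty clause
    lit = next(iter(next(iter(formula))))   # naive literal selection
    return dpll(assign(formula, lit)) or dpll(assign(formula, -lit))

satisfiable = dpll(cnf([[1, 2], [-1, 2], [-2, 3]]))
unsatisfiable = dpll(cnf([[1, 2], [1, -2], [-1, 2], [-1, -2]]))
```

The first formula has a model (for instance $x_2 = x_3 = true$), while the second one contains the four clauses over two variables and therefore has none.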

We say that two clauses $x \vee A, \bar{x} \vee B \in \mathcal{F}$ clash iff $A \vee B$ is not a tautology (namely,
$\forall_{l \in A}\ \bar{l} \notin B$) and is not absorbed (namely, $\forall_{C \in \mathcal{F}}\ C \not\subseteq A \vee B$). The resolution rule,
$\{x \vee A, \bar{x} \vee B\} \equiv \{x \vee A, \bar{x} \vee B, A \vee B\}$, is applied to clashing clauses and is central to
inference algorithms. Variable $x$ is called the clashing variable and $A \vee B$ is called
the resolvent. Resolution, which is sound and complete, adds to the formula (i.e.,
makes explicit) an implicit relation between $A$ and $B$. Note that unit clause reduction
is just a particular case of resolution.

Two years before DPLL, Davis and Putnam proved that a restricted amount of
resolution performed along some ordering of the variables is sufficient for deciding
satisfiability. The corresponding algorithm is noted DP [28,25]. Figure 3 provides a
recursive description. It eliminates variables one-by-one until it obtains the empty
formula or achieves a contradiction. The heart of DP is function VarElim. It
eliminates variable $x_i$ from formula $\mathcal{F}$ while preserving its solvability. First, it computes
the so-called bucket of $x_i$, noted $\mathcal{B}$, which contains the set of clauses mentioning the
variable (line 1). All the clauses in the bucket are removed from the formula (line
2). Next, it applies resolution restricted to the clauses in the bucket while pairs of
clashing clauses exist. Resolvents are added to the formula (line 6). The correctness
of DP is based on the fact that clauses added in line 6 keep the essential information
contained in clauses removed in line 2. Observe that the pure literal rule is just a
special case of variable elimination in which no pair of clashing clauses exists, so
the inner loop never iterates.

function VarElim($\mathcal{F}$, $x_i$) return CNF formula
1. $\mathcal{B} := \{C \in \mathcal{F} \mid x_i \in var(C)\}$
2. $\mathcal{F} := \mathcal{F} - \mathcal{B}$
3. while $\exists\, x_i \vee A \in \mathcal{B}$ do
4.    $x_i \vee A$ := PopClause($\mathcal{B}$)
5.    while $\exists\, \bar{x}_i \vee B \in \mathcal{B}$ s.t. Clash($x_i \vee A$, $\bar{x}_i \vee B$) do
6.       $\mathcal{F} := \mathcal{F} \cup \{A \vee B\}$
7.    endwhile
8. endwhile
9. return ($\mathcal{F}$)
endfunction

function DP($\mathcal{F}$) return boolean
10. $\mathcal{F}$ := Simplify($\mathcal{F}$)
11. if $\mathcal{F} = \emptyset$ then return true
12. if $\mathcal{F} = \{\Box\}$ then return false
13. $x_i$ := SelectVar($\mathcal{F}$)
14. return DP(VarElim($\mathcal{F}$, $x_i$))
endfunction

Fig. 3. DP is a pure inference algorithm. It returns true iff $\mathcal{F}$ is satisfiable.
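Variable elimination can be sketched in Python as follows. This is a simplified illustration rather than a transcription of Figure 3: the elimination order is passed explicitly, the Simplify step and the absorption test of Clash are omitted (only tautological resolvents are discarded), and the names `var_elim` and `dp` as well as the integer encoding of literals are our own assumptions.

```python
def is_tautology(clause):
    """A clause is a tautology if it contains a literal and its negation."""
    return any(-lit in clause for lit in clause)

def var_elim(formula, var):
    """Eliminate `var` by resolving every positive/negative pair of its bucket."""
    bucket = {c for c in formula if var in c or -var in c}
    rest = set(formula) - bucket
    positive = [c for c in bucket if var in c]
    negative = [c for c in bucket if -var in c]
    for p in positive:
        for n in negative:
            resolvent = (p - {var}) | (n - {-var})
            if not is_tautology(resolvent):
                rest.add(resolvent)
    return rest

def dp(clauses, order):
    """Return True iff the formula is satisfiable; `order` lists all variables."""
    formula = {frozenset(c) for c in clauses}
    for var in order:
        if frozenset() in formula:
            return False              # explicit contradiction
        formula = var_elim(formula, var)
    return frozenset() not in formula

sat = dp([[1, 2], [-1, 2], [-2, 3]], [1, 2, 3])
unsat = dp([[1, 2], [1, -2], [-1, 2], [-1, -2]], [1, 2])
```

When a variable occurs with a single sign, one of the two lists is empty and the bucket is simply discarded, which corresponds to the pure literal rule.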

The following lemma shows how the complexity of eliminating a variable depends 
on the number of other variables that it interacts with. 

Lemma 2 [25] Let $\mathcal{F}$ be a CNF formula and $x_i$ one of its variables. Let $n_i$ be
the number of variables sharing some clause with $x_i$ in $\mathcal{F}$. The space and time
complexity of VarElim($\mathcal{F}$, $x_i$) is $O(3^{n_i})$ and $O(9^{n_i})$, respectively.

The following lemma shows how the induced graph $G^*_d(\mathcal{F})$ captures the evolution
of the interaction graph $G(\mathcal{F})$ as variables are eliminated.

Lemma 3 [25] Let $d$ denote the reverse order in which DP($\mathcal{F}$) eliminates
variables. The width of $x_i$ along $d$ in the induced graph $G^*_d(\mathcal{F})$ is an upper bound on the
number of variables sharing some clause with $x_i$ at the time of its elimination.

Thus, the induced width captures the most expensive variable elimination. The fol- 
lowing theorem, which follows from the two previous lemmas, characterizes the 
complexity of DP in terms of the induced width. 

Theorem 4 [25] Let $d$ denote the reverse order in which DP($\mathcal{F}$) eliminates
variables. Let $w^*_d$ denote the induced width of $G(\mathcal{F})$ along $d$. The space and time
complexity of DP($\mathcal{F}$) is $O(n \times 3^{w^*_d})$ and $O(n \times 9^{w^*_d})$, respectively.

A consequence of the previous theorem is that the order in which DP eliminates
variables may be crucial for the algorithm's complexity. As an example, consider a
formula whose interaction graph is a tree of depth 1. If variables are eliminated in
a top-down order, the cost may be exponential in $n$. If variables are eliminated in
a bottom-up order, the cost is linear. In general, finding optimal elimination
orderings is an NP-hard problem and approximate algorithms must be used. In practical
applications, DP is generally too space consuming and cannot be used [25].
Nevertheless, resolution still plays an important practical role in combination with search:
the addition of restricted forms of resolution at each search node anticipates the
detection of dead-ends and improves its performance [29,25,30,31]. As we will show,
the use of resolution is even more relevant in the Max-SAT context.



3 (Weighted) Max-SAT 



When a boolean formula does not have any model, one may be interested in finding
a complete assignment with minimum number of violated clauses. This problem
is known as (unweighted) Max-SAT. Note that no repetition of clauses is allowed
and all clauses are equally important. The complexity of Max-SAT is $P^{NP[\log n]}$,
meaning that it can be solved with a logarithmic number of calls to an NP oracle [32].

Weighted Max-SAT is an extension of Max-SAT. A weighted clause is a pair (C, w) 
such that C is a classical clause and w is a natural number indicating the cost of its 
falsification. A weighted formula in conjunctive normal form is a set of weighted 
clauses. The cost of an assignment is the sum of weights of all the clauses that it 
falsifies. Given a weighted formula, weighted Max-SAT is the problem of finding a 
complete assignment with minimal cost. We can assume that all clauses in the formula
are different, since $(C, u), (C, w)$ can be replaced by $(C, u + w)$. Note that clauses
with cost 0 do not have any effect and can be discarded. Weighted Max-SAT is
more expressive than unweighted Max-SAT and its complexity, $P^{NP}$, is higher [32]
(it may require a linear number of calls to a SAT oracle). Since most Max-SAT
applications require the expressiveness of weights, in this paper we will focus on
weighted Max-SAT. In the following, when we say Max-SAT we will be referring
to weighted Max-SAT.

Example 5 Given a graph $G = (V,E)$, a vertex covering is a set $U \subseteq V$ such that
for every edge $(v_i, v_j)$ either $v_i \in U$ or $v_j \in U$. The size of a vertex covering is
$|U|$. The minimum vertex covering problem is a well-known NP-hard problem. It
consists in finding a covering of minimal size. It can be naturally formulated as
(weighted) Max-SAT. We associate one variable $x_i$ to each graph vertex. Value true
(respectively, false) indicates that vertex $x_i$ belongs to $U$ (respectively, to $V - U$).
There is a binary weighted clause $(x_i \vee x_j, u)$ for each edge $(v_i, v_j) \in E$, where $u$ is a
number larger than $|V|$. It specifies that at least one of these vertices must be in the
covering because there is an edge connecting them. There is a unary clause $(\bar{x}_i, 1)$
for each variable $x_i$, in order to specify that it is preferred not to add vertices to $U$.
Note that different weights in unary and binary clauses are required to express the
relative importance of each type of clauses.

Consider the minimum vertex covering of the graph in Figure 1(a). The Max-SAT
encoding is $\mathcal{F} = \{(\bar{x}_1, 1), (\bar{x}_2, 1), (\bar{x}_3, 1), (\bar{x}_4, 1), (\bar{x}_5, 1), (x_1 \vee x_4, 5), (x_2 \vee x_3, 5),
(x_2 \vee x_4, 5), (x_2 \vee x_5, 5), (x_4 \vee x_5, 5)\}$. The optimal assignment is $\{x_2 = x_4 = true, x_1 = x_3 =
x_5 = false\}$ with cost 2, which is equal to the size of the minimum vertex covering.
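The encoding of the example can be checked by exhaustive enumeration. The following Python sketch is only meant as an illustration: the names `cost` and `brute_force` are our own, literals are encoded as non-zero integers, and a weighted clause is a pair made of a tuple of literals and a weight.

```python
from itertools import product

def cost(formula, assignment):
    """Sum of the weights of the clauses falsified by `assignment`."""
    total = 0
    for clause, weight in formula:
        satisfied = any(assignment[abs(l)] == (l > 0) for l in clause)
        if not satisfied:
            total += weight
    return total

def brute_force(formula, variables):
    """Return (cost, assignment) of a minimum-cost complete assignment."""
    best = None
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        c = cost(formula, assignment)
        if best is None or c < best[0]:
            best = (c, assignment)
    return best

# Graph of Figure 1(a): one soft unary clause per vertex, one binary clause per edge.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 4), (2, 3), (2, 4), (2, 5), (4, 5)]
formula = [((-v,), 1) for v in vertices] + [((a, b), 5) for a, b in edges]

best_cost, best_assignment = brute_force(formula, vertices)
cover = {v for v in vertices if best_assignment[v]}
```

The enumeration returns cost 2 with the covering formed by vertices 2 and 4, in agreement with the example.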

Next, we propose an alternative, although equivalent, definition for weighted Max- 
SAT that will be more convenient for our purposes. Given a weighted CNF formula, 
we assume the existence of a known upper bound T on the cost of an optimal solu- 
tion (T is a strictly positive natural number). This is done without loss of generality 
because, if a tight upper bound is not known, T can be set to any number higher 
than the sum of weights of all the clauses. A model for the formula is a complete 
assignment with cost less than T. An optimal model is a model of minimal cost. 
Then, Max-SAT can be reformulated as the problem of finding an optimal model, 
if there is any. Observe that any weight $w \geq T$ indicates that the associated clause
must necessarily be satisfied. Thus, we can replace $w$ by $T$ without changing the
problem. Consequently, without loss of generality we assume all costs in the interval $[0..T]$
and, following [33], redefine the sum of costs as,

$a \oplus b = \min\{a + b, T\}$

in order to keep the result within the interval [0..T]. A clause with cost T is called 
mandatory (or hard). A clause with cost less than T is called non-mandatory (or 
soft). 

Definition 6 A Max-SAT instance is a pair $(\mathcal{F}, T)$ where $T$ is a natural number
and $\mathcal{F}$ is a set of weighted clauses with weights in the interval $[0..T]$. The task of
interest is to find an optimal model, if there is any.

The following example shows that T can be used to express that we are only inter- 
ested in assignments of a certain quality. 

Example 7 Consider again the minimum vertex covering problem of the graph in
Figure 1(a). With the new notation, the associated formula is

$\mathcal{F} = \{(\bar{x}_1, 1), (\bar{x}_2, 1), (\bar{x}_3, 1), (\bar{x}_4, 1), (\bar{x}_5, 1), (x_1 \vee x_4, T), (x_2 \vee x_3, T),
(x_2 \vee x_4, T), (x_2 \vee x_5, T), (x_4 \vee x_5, T)\}$

which shows more clearly which clauses are truly weighted and which ones are
mandatory. In the absence of additional information, $T$ should be set to the sum of
weights ($T = 5$), meaning that any assignment that satisfies the mandatory clauses
should be taken into consideration. Suppose now that somehow (for example, with
a local search algorithm) we find a covering of size 3. We can set $T$ to 3 because
any assignment with cost 3 or higher does not interest us anymore. The resulting
Max-SAT problem is tighter (and easier, because more partial assignments can be
identified as infeasible).

The interest of adding $T$ to the problem formulation is twofold. On the one hand, it
makes explicit the mandatory nature of mandatory clauses. Besides, as we will
see later, it allows us to discover mandatory clauses that were disguised as weighted
clauses. On the other hand, it allows us to see SAT as a particular case of Max-SAT.

Remark 8 A Max-SAT instance with $T = 1$ is essentially a SAT instance because
there is no positive weight below $T$. Consequently, every clause in the formula is mandatory.

A weighted CNF formula may contain $(\Box, w)$ among its clauses. Since $\Box$ cannot
be satisfied, $w$ is a necessary cost of any model. Therefore, $w$ is an explicit lower
bound of the cost of an optimal model. When the lower bound and the upper bound
have the same value (i.e., $(\Box, T) \in \mathcal{F}$) the formula is trivially unsatisfiable and
we call this situation an explicit contradiction. The idea of adding an upper bound
$T$ and a lower bound $(\Box, w)$ to the problem formulation was first proposed in the
WCSP context [33].



4 Extending SAT solving techniques to Max-SAT 

4.1 Extending Simplification Rules and Clause Negation 

We say that two Max-SAT formulas are equivalent, $\mathcal{F} \equiv \mathcal{F}'$, if they contain the
same set of variables, and complete assignments have the same costs. The following
equivalence rules can be used to simplify CNF weighted formulas,

• Aggregation: $\{(A, w), (A, u)\} \equiv \{(A, w \oplus u)\}$

• Absorption: $\{(A, T), (A \vee B, w)\} \equiv \{(A, T)\}$

• Unit clause reduction: $\{(l, T), (\bar{l} \vee A, w)\} \equiv \{(l, T), (A, w)\}$

• Hardening: If $\bigoplus_{i=1}^{k} u_i = T$ and $\forall_{1 \leq i < k}\ C_i \subset C_k$ then

$\{(C_i, u_i)\}_{i=1}^{k-1} \cup \{(C_k, u_k)\} \equiv \{(C_i, u_i)\}_{i=1}^{k-1} \cup \{(C_k, T)\}$

Aggregation generalizes to Max-SAT the idempotency of the conjunction in
classical SAT. The absorption rule indicates that in the Max-SAT context the absorbing clause
must be mandatory. Similarly, unit clause reduction requires the unit clause to be
mandatory. The correctness of these equivalences is direct and we omit the proof.
The hardening rule allows us to identify weighted clauses that are indeed mandatory.
It holds because the violation of $C_k$ implies the violation of all $C_i$ with $i < k$.
Therefore, any assignment that violates $C_k$ will have cost $\bigoplus_{i=1}^{k} u_i = T$.
It is easy to see that the pure literal rule can also be applied to Max-SAT. Besides,
the assignment of a formula $\mathcal{F}[l]$ also holds in Max-SAT. As in SAT, it can be
seen as the addition of $(l, T)$ to the formula, which allows a sequence of unit clause
reductions followed by the application of the pure literal rule.

Example 9 Consider the following formula $\{(x, T), (\bar{x}, 3), (y, 8), (\bar{x} \vee \bar{y}, 3)\}$ with
$T = 10$. We can apply unit clause reduction to the first and second clauses, which
produces $\{(x, T), (\Box, 3), (y, 8), (\bar{x} \vee \bar{y}, 3)\}$. We can apply it again to the first and
fourth clauses producing $\{(x, T), (\Box, 3), (y, 8), (\bar{y}, 3)\}$. The pure literal rule allows us
to remove the first clause producing $\{(\Box, 3), (y, 8), (\bar{y}, 3)\}$. We can harden the
second clause because $3 \oplus 8 = T$. Thus, we obtain $\{(\Box, 3), (y, T), (\bar{y}, 3)\}$. Unit clause
reduction produces $\{(\Box, 3), (y, T), (\Box, 3)\}$. Aggregation yields $\{(\Box, 6), (y, T)\}$ and
the pure literal rule produces the formula $\{(\Box, 6)\}$ which trivially has an optimal
model of cost 6.
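The outcome of the simplification can be cross-checked by enumerating the four complete assignments of the original formula and adding costs with the bounded sum. The Python sketch below is only an illustration (the name `bounded_cost` and the integer encoding of literals are our own assumptions).

```python
from itertools import product

T = 10
X, Y = 1, 2   # integer codes for variables x and y; a negative code is a negated literal

def bounded_cost(formula, assignment, top):
    """Cost of an assignment, adding weights with a ⊕ b = min(a + b, top)."""
    total = 0
    for clause, weight in formula:
        if not any(assignment[abs(l)] == (l > 0) for l in clause):
            total = min(total + weight, top)
    return total

# {(x, T), (not x, 3), (y, 8), (not x or not y, 3)}
example9 = [((X,), T), ((-X,), 3), ((Y,), 8), ((-X, -Y), 3)]

costs = {}
for x, y in product([False, True], repeat=2):
    costs[(x, y)] = bounded_cost(example9, {X: x, Y: y}, T)
optimum = min(costs.values())
```

Only the assignment $x = y = true$ has cost below $T$, and its cost is 6, the weight of the empty clause obtained in the example.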

Proposition 10 The algorithm that applies the previous simplifications until qui- 
escence terminates in polynomial time. 

Observe that if an explicit contradiction is achieved (i.e., $(\Box, T) \in \mathcal{F}$) all clauses
are subsequently absorbed and the formula immediately collapses to $(\Box, T)$.

The negation of a weighted clause $(C, w)$, noted $(\bar{C}, w)$, means that the satisfaction
of $C$ has cost $w$, while its falsification is cost-free. Note that $\bar{C}$ is not clausal when
$|C| > 1$. In classical SAT the De Morgan rule can be used to recover the CNF syntax,
but the following example shows that it cannot be applied to weighted clauses.

Example 11 Consider the weighted clause $(x \vee y, 1)$ with $T > 1$. The truth table of
its negation $(\overline{x \vee y}, 1)$ and the truth table of $\{(\bar{x}, 1), (\bar{y}, 1)\}$ are given below (ignore
the last column for the moment). Note that they are not equivalent.



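$x\ y$    $(\overline{x \vee y}, 1)$    $\{(\bar{x}, 1), (\bar{y}, 1)\}$    $\{(\bar{x} \vee \bar{y}, 1), (\bar{x} \vee y, 1), (x \vee \bar{y}, 1)\}$
f f       0                             $0 \oplus 0 = 0$                     $0 \oplus 0 \oplus 0 = 0$
f t       1                             $0 \oplus 1 = 1$                     $0 \oplus 0 \oplus 1 = 1$
t f       1                             $1 \oplus 0 = 1$                     $0 \oplus 1 \oplus 0 = 1$
t t       1                             $1 \oplus 1 = 2$                     $1 \oplus 0 \oplus 0 = 1$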

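The following recursive transformation rule allows us to recover the clausal form in
totally or partially negated clauses. Let $A$ and $B$ be arbitrary disjunctions of literals.

$$CNF(A \vee \overline{l \vee B}, u) = \begin{cases}
\{(A \vee \bar{l}, u)\} & : |B| = 0 \\
\{(A \vee \bar{l} \vee B, u)\} \cup CNF(A \vee l \vee \bar{B}, u) \cup CNF(A \vee \bar{l} \vee \bar{B}, u) & : |B| > 0
\end{cases}$$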
The last column in the truth table of the previous example shows the proper CNF
encoding of clause $(\overline{x \vee y}, 1)$. The main drawback of this rule is that it generates an
exponential number of new clauses with respect to the arity of the negated clause. We
will show in Subsection 4.3 that it is possible to transform it into a linear number
of clauses.



Theorem 12 $CNF(A \vee \overline{l \vee B}, u)$ returns an equivalent CNF expression.



PROOF. It is clear that $CNF(A \vee \overline{l \vee B}, u)$ generates a CNF expression because the
negation is applied to a smaller sub-expression at each recursive call. Eventually, it
will be applied to literals, so the expression will be a clause. We prove that
$CNF(A \vee \overline{l \vee B}, u)$ returns an equivalent expression by induction over $|B|$. The $|B| = 0$ case is trivial
since the left-hand and the right-hand sides are the same. Regarding the $|B| > 0$
case, there are three ways to falsify $A \vee \overline{l \vee B}$. Each one of the three elements in the
right-hand side corresponds to one of them. The last two are assumed correct by
the induction hypothesis.
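The transformation and its equivalence can also be checked mechanically on small cases. The following Python sketch is an illustration under our own conventions: a partially negated clause $A \vee \overline{l_1 \vee \ldots \vee l_k}$ is given as a tuple `a` of literals and a non-empty tuple `negated` holding $l_1, \ldots, l_k$, and literals are non-zero integers.

```python
from itertools import product

def cnf(a, negated, u):
    """Clausal form of (A ∨ not(l ∨ B), u) following the recursive rule."""
    l, b = negated[0], negated[1:]
    if not b:
        return [(a + (-l,), u)]
    return ([(a + (-l,) + b, u)]
            + cnf(a + (l,), b, u)       # l false, B true
            + cnf(a + (-l,), b, u))     # l true, B true

def falsified(clause, assignment):
    return not any(assignment[abs(l)] == (l > 0) for l in clause)

def negated_cost(a, negated, u, assignment):
    """(A ∨ not N, u) costs u when A is false and the disjunction N is true."""
    if falsified(a, assignment) and not falsified(negated, assignment):
        return u
    return 0

def equivalent(a, negated, u, variables):
    """Compare both costs on every complete assignment."""
    clauses = cnf(a, negated, u)
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        clausal = sum(w for c, w in clauses if falsified(c, assignment))
        if clausal != negated_cost(a, negated, u, assignment):
            return False
    return True

example11 = cnf((), (1, 2), 1)          # negation of (x ∨ y, 1)
three_literals = cnf((4,), (1, 2, 3), 1)
```

For a negated part with $k$ literals the rule produces $2^k - 1$ clauses (3 for Example 11, 7 for three literals), which illustrates the exponential growth.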



Remark 13 The weighted expression $(A \vee C \vee \overline{C \vee B}, u)$, where $A$, $B$ and $C$ are
disjunctions of literals, is equivalent to $(A \vee C \vee \bar{B}, u)$, because they are falsified
under the same circumstances.



4.2 Extending DPLL 

In Figure 4 we present Max-DPLL, the extension of DPLL to Max-SAT.
Max-DPLL($\mathcal{F}$, $T$) returns the cost of the optimal model if there is any, else it returns $T$.
First, the input formula is simplified with the rules from the previous subsection
(line 1). If the resulting formula is empty, there is a 0 cost model (line 2). If the
resulting formula only contains the empty clause, the algorithm returns its cost (line
3). Else, it selects a literal $l$ (line 4) and makes two recursive calls (lines 5 and 6).
In these calls the formula is instantiated with $l$ and $\bar{l}$, respectively. Observe that the first recursive
call is made with the $T$ inherited from its parent, but the second call uses the output
of the first call. This implements the typical upper bound updating of depth-first
branch and bound. Finally, the best value of the two recursive calls is returned (line
7). Observe that, as search goes on, the value of $T$ may decrease. Consequently,
clauses that originally were soft may become hard which, in turn, may strengthen
the potential of the simplification rules. The parallelism with DPLL (Figure 2) is
obvious. The following statement shows that Max-DPLL is a true extension of
classical DPLL.

Remark 14 The execution of Max-DPLL with a SAT instance (i.e., $(\mathcal{F}, T)$ with
$T = 1$) behaves like classical DPLL.
function Max-DPLL($\mathcal{F}$, $T$) return nat
1. $\mathcal{F}$ := Simplify($\mathcal{F}$, $T$)
2. if $\mathcal{F} = \emptyset$ then return 0
3. if $\mathcal{F} = \{(\Box, w)\}$ then return $w$
4. $l$ := SelectLiteral($\mathcal{F}$)
5. $T$ := Max-DPLL($\mathcal{F}[l]$, $T$)
6. $T$ := Max-DPLL($\mathcal{F}[\bar{l}]$, $T$)
7. return $T$
endfunction

Fig. 4. If $(\mathcal{F}, T)$ has models, Max-DPLL returns the optimal cost. Else it returns $T$.

It is easy to see that the time complexity of Max-DPLL is exponential in the
number of variables $n$ and the space complexity is polynomial in $|\mathcal{F}|$. Therefore, DPLL
and Max-DPLL have the same complexity.
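The branch and bound scheme of Max-DPLL can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the solver evaluated later in the paper: the only simplifications performed are aggregation (equal clauses are merged with $\oplus$) and the pruning of explicit contradictions, literal selection is naive, and the names `wcnf`, `assign` and `max_dpll` are our own.

```python
def wcnf(weighted_clauses, top):
    """Build a formula as a dict {frozenset of literals: weight}."""
    formula = {}
    for clause, w in weighted_clauses:
        key = frozenset(clause)
        formula[key] = min(formula.get(key, 0) + w, top)
    return formula

def assign(formula, lit, top):
    """Return F[lit], merging clauses that become equal (aggregation)."""
    result = {}
    for clause, w in formula.items():
        if lit in clause:
            continue                      # satisfied clauses disappear
        reduced = clause - {-lit}
        result[reduced] = min(result.get(reduced, 0) + w, top)
    return result

def max_dpll(formula, top):
    """Return the optimal cost if it is below `top`, else `top`."""
    lower = formula.get(frozenset(), 0)   # weight of the empty clause
    if lower >= top:
        return top                        # explicit contradiction
    rest = [c for c in formula if c]
    if not rest:
        return lower
    lit = next(iter(rest[0]))
    top = max_dpll(assign(formula, lit, top), top)    # may lower the upper bound
    top = max_dpll(assign(formula, -lit, top), top)
    return top

# Vertex covering instance of Example 7 with T = 5.
T = 5
edges = [(1, 4), (2, 3), (2, 4), (2, 5), (4, 5)]
clauses = [((-v,), 1) for v in range(1, 6)] + [((a, b), T) for a, b in edges]
optimal = max_dpll(wcnf(clauses, T), T)
```

With $T = 5$ the sketch returns the optimal cost 2. If it is called with $T = 2$ it returns 2 as well, this time meaning that there is no model of cost below 2.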



4.3 Extending the Resolution Rule 



Consider the subtraction of costs ($\ominus$) defined as in [34]. Let $u, w \in [0, \ldots, T]$ be
two weights such that $u \geq w$.

$$u \ominus w = \begin{cases} u - w & : u \neq T \\ T & : u = T \end{cases}$$

Essentially, $\ominus$ behaves like the usual subtraction except that $T$ is an absorbing
element. The resolution rule can be extended from SAT to Max-SAT as,



$$\{(x \vee A, u), (\bar{x} \vee B, w)\} \equiv \left\{ \begin{array}{l}
(A \vee B, m), \\
(x \vee A, u \ominus m), \\
(\bar{x} \vee B, w \ominus m), \\
(x \vee A \vee \bar{B}, m), \\
(\bar{x} \vee \bar{A} \vee B, m)
\end{array} \right\}$$

where $m = \min\{u, w\}$. In this rule, which we call Max-RES, $(A \vee B, m)$ is called the
resolvent; $(x \vee A, u \ominus m)$ and $(\bar{x} \vee B, w \ominus m)$ are called the posterior clashing clauses;
$(x \vee A \vee \bar{B}, m)$ and $(\bar{x} \vee \bar{A} \vee B, m)$ are called the compensation clauses. The effect of
Max-RES, as in classical resolution, is to infer (namely, make explicit) a
connection between $A$ and $B$. However, there is an important difference between classical
resolution and Max-RES. While classical resolution yields the addition of a new
clause, Max-RES is a transformation rule. Namely, it requires the replacement of
the left-hand clauses by the right-hand clauses. The reason is that some cost of
the prior clashing clauses must be subtracted in order to compensate for the new
inferred information. Consequently, Max-RES is better understood as a movement of
knowledge.
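The arithmetic on weights used by Max-RES is easy to state in code. The following Python sketch is an illustration only (the function names `oplus`, `ominus` and `max_res_weights` are our own). It implements the bounded sum, the bounded subtraction and the three weights that appear on the right-hand side of the rule.

```python
def oplus(a, b, top):
    """Bounded sum: a ⊕ b = min(a + b, T)."""
    return min(a + b, top)

def ominus(u, w, top):
    """Bounded subtraction u ⊖ w for u >= w; T is an absorbing element."""
    if not 0 <= w <= u <= top:
        raise ValueError("expected 0 <= w <= u <= T")
    return top if u == top else u - w

def max_res_weights(u, w, top):
    """Weights produced by Max-RES: (m, u ⊖ m, w ⊖ m) with m = min(u, w)."""
    m = min(u, w)
    return m, ominus(u, m, top), ominus(w, m, top)

soft_case = max_res_weights(3, 4, 5)      # two soft clauses
hard_case = max_res_weights(10, 10, 10)   # two mandatory clauses
```

In the soft case the resolvent takes weight 3 and the clashing clauses keep weights 0 and 1. In the mandatory case every weight remains $T$, so nothing is subtracted from the clashing clauses.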

Example 15 If we apply Max-RES to the following clauses $\{(x \vee y, 3), (\bar{x} \vee y \vee z, 4)\}$
(with $T > 4$) we obtain $\{(y \vee y \vee z, 3), (x \vee y, 3 \ominus 3), (\bar{x} \vee y \vee z, 4 \ominus 3),
(x \vee y \vee \overline{(y \vee z)}, 3), (\bar{x} \vee \bar{y} \vee y \vee z, 3)\}$. The first and fourth clauses can be simplified.
The second clause can be omitted because its weight is zero. The fifth clause can
be omitted because it is a tautology. Therefore, we obtain the equivalent formula
$\{(y \vee z, 3), (\bar{x} \vee y \vee z, 1), (x \vee y \vee \bar{z}, 3)\}$.
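The equivalence claimed in the example can be verified by comparing the cost of both formulas on the eight complete assignments. The Python sketch below is only an illustration: $T = 5$ is an arbitrary choice satisfying $T > 4$, variables $x$, $y$, $z$ are encoded as 1, 2, 3, and the name `bounded_cost` is our own.

```python
from itertools import product

T = 5            # any T > 4 would do
X, Y, Z = 1, 2, 3

def bounded_cost(formula, assignment, top):
    """Cost of an assignment, adding weights with the bounded sum."""
    total = 0
    for clause, weight in formula:
        if not any(assignment[abs(l)] == (l > 0) for l in clause):
            total = min(total + weight, top)
    return total

before = [((X, Y), 3), ((-X, Y, Z), 4)]
after = [((Y, Z), 3), ((-X, Y, Z), 1), ((X, Y, -Z), 3)]

table = []
for values in product([False, True], repeat=3):
    assignment = dict(zip((X, Y, Z), values))
    table.append((values,
                  bounded_cost(before, assignment, T),
                  bounded_cost(after, assignment, T)))
same_costs = all(left == right for _, left, right in table)
```

Every row of `table` carries the same cost in both columns, so the two formulas are equivalent in the sense of Subsection 4.1.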

The previous example showed that, under certain conditions, some of the
right-hand side clauses can be removed. Clause $(x \vee A, u \ominus m)$ (symmetrically for
$(\bar{x} \vee B, w \ominus m)$) can be omitted iff either,

• $B \subseteq A \wedge m = T$, or

• $u = m < T$.

The first case holds because the clause is absorbed by the resolvent $(A, T)$. The
second case holds because $u \ominus m = 0$.

Regarding clause $(x \vee A \vee \bar{B}, m)$ (symmetrically for $(\bar{x} \vee \bar{A} \vee B, m)$), it can be omitted
iff either,

• $B \subseteq A$, or

• $u = T$.

The first case holds because the clause is a tautology. The second case holds
because the clause is absorbed by the posterior clashing clause $(x \vee A, T \ominus m = T)$.

Remark 16 The application of Max-RES to mandatory clauses is equivalent to
classical resolution.

PROOF. Clashing clauses being mandatory means that $u = w = T$. Clearly, $m =
\min\{u, w\} = T$, $u \ominus m = T$ and $w \ominus m = T$. Consequently, all right-hand clauses are
mandatory. Therefore, the prior and posterior clashing clauses are equal. Moreover,
the compensation clauses are absorbed by the clashing clauses (as we previously
noted). Thus, Max-RES has the effect of adding $(A \vee B, T)$ to the formula, which
is equivalent to classical resolution.

Theorem 17 Max-RES is sound. 
PROOF. The following table contains in the first columns all the truth
assignments, in the second column the cost of the assignment according to the clauses on
the left-hand side of the Max-RES rule, and in the third column the cost of the
assignment according to the clauses on the right-hand side of the Max-RES rule. As can be
observed, the costs are the same, so the resulting problem is equivalent.
Observe that compensation clauses $(x \vee A \vee \bar{B}, m)$ and $(\bar{x} \vee \bar{A} \vee B, m)$ are not in clausal form when $|A| > 1$ and $|B| > 1$. In the following, we assume that they are transformed to clausal form as needed. In Subsection 4.1, we introduced a recursive rule that recovers the clausal form of totally or partially negated clauses. We noted that it produces an exponentially large number of new clauses. Interestingly, Max-RES allows us to redefine it in such a way that only a linear number of clauses is generated,

$$
CNF_{linear}(A \vee \overline{l \vee B}, u) =
\begin{cases}
\{(A \vee \bar{l}, u)\} & : |B| = 0 \\
\{(A \vee \bar{l} \vee B, u)\} \cup CNF_{linear}(A \vee \bar{B}, u) & : |B| > 0
\end{cases}
$$

The new rule is correct because the two recursive calls of CNF (Subsection 4.1), $CNF(A \vee l \vee \bar{B}, u)$ and $CNF(A \vee \bar{l} \vee \bar{B}, u)$, can be resolved on literal $l$ and we obtain the equivalent call $CNF(A \vee \bar{B}, u)$. For example, the application of $CNF_{linear}$ to $(\overline{x \vee y}, 1)$ (Example 11) produces the equivalent $\{(\bar{x} \vee y, 1), (\bar{y}, 1)\}$. Observe that the output of $CNF_{linear}$ depends on how the literals are ordered in the clause.
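
The linear rule is short enough to sketch directly. The code below is only an illustration under assumed conventions (literals are non-zero integers, a negative integer is a negated variable, and the names `cnf_linear`, `clause_cost` and `negated_cost` are hypothetical). The two cost functions make it possible to compare the generated clauses against the semantics of the partially negated clause on any assignment.

```python
def cnf_linear(a, negated, u):
    """Clausal form of (A v NOT(l_1 v ... v l_r), u) with r clauses.

    `a` holds the literals of A, `negated` the literals under the negation.
    Each step emits (A v ~l v B) and recurses on (A v NOT(B)).
    """
    if not negated:
        return []  # NOT(empty clause) is always true: nothing to pay for
    l, *b = negated
    first = (frozenset(a) | {-l} | frozenset(b), u)
    return [first] + cnf_linear(a, b, u)

def clause_cost(clauses, assignment):
    """Sum of the weights of the clauses falsified by `assignment` (var -> bool)."""
    def sat(lit):
        return assignment[abs(lit)] == (lit > 0)
    return sum(w for c, w in clauses if not any(sat(l) for l in c))

def negated_cost(a, negated, u, assignment):
    """Cost of (A v NOT(l_1 v ... v l_r), u): paid iff A is false and some l_i is true."""
    def sat(lit):
        return assignment[abs(lit)] == (lit > 0)
    if any(sat(l) for l in a) or not any(sat(l) for l in negated):
        return 0
    return u
```

With variables x = 1 and y = 2, `cnf_linear([], [1, 2], 1)` yields the two clauses of the example in the text; reordering `negated` yields a different but equivalent set.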
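function Max-VarElim($\mathcal{F}$, $\top$, $x_i$) return weighted CNF formula
 1.  $\mathcal{B} := \{(C, u) \in \mathcal{F} \mid x_i \in var(C)\}$
 2.  $\mathcal{F} := \mathcal{F} - \mathcal{B}$
 3.  while $\exists (x_i \vee A, u) \in \mathcal{B}$ do
 4.      $(x_i \vee A, u)$ := PopMinSizeClause($\mathcal{B}$)
 5.      while $u > 0 \wedge \exists (\bar{x}_i \vee B, w) \in \mathcal{B}$ s.t. Clash($x_i \vee A$, $\bar{x}_i \vee B$) do
 6.          $m := \min\{u, w\}$
 7.          $u := u \ominus m$
 8.          $\mathcal{B} := \mathcal{B} - \{(\bar{x}_i \vee B, w)\} \cup \{(\bar{x}_i \vee B, w \ominus m)\}$
 9.          $\mathcal{B} := \mathcal{B} \cup \{(x_i \vee A \vee \bar{B}, m), (\bar{x}_i \vee \bar{A} \vee B, m)\}$
 10.         $\mathcal{F} := \mathcal{F} \cup \{(A \vee B, m)\}$
 11.     endwhile
 12. endwhile
 13. return ($\mathcal{F}$)
endfunction

function Max-DP($\mathcal{F}$, $\top$) return nat
 14. $\mathcal{F}$ := Simplify($\mathcal{F}$, $\top$)
 15. if $\mathcal{F} = \emptyset$ then return 0
 16. if $\mathcal{F} = \{(\Box, u)\}$ then return $u$
 17. $x_i$ := SelectVar($\mathcal{F}$)
 18. return Max-DP(Max-VarElim($\mathcal{F}$, $\top$, $x_i$), $\top$)
endfunction

Fig. 5. If $(\mathcal{F}, \top)$ has models, Max-DP returns their optimal cost. Else it returns $\top$.
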
4.4 Extending DP 



The following example shows that, unlike classical resolution, the unrestricted application of Max-RES does not guarantee termination.

Example 18 Consider the following formula $\{(x \vee y, 1), (\bar{x} \vee z, 1)\}$ with $\top = 3$. If we apply Max-RES, we obtain $\{(y \vee z, 1), (x \vee y \vee \bar{z}, 1), (\bar{x} \vee \bar{y} \vee z, 1)\}$. If we apply Max-RES to the first and second clauses we obtain $\{(x \vee y, 1), (\bar{x} \vee y \vee z, 1), (\bar{x} \vee \bar{y} \vee z, 1)\}$. If we now apply Max-RES to the second and third clauses we obtain $\{(x \vee y, 1), (\bar{x} \vee z, 1)\}$, which is the initial formula.

Nevertheless, Bonet et al. [35] have recently proved that when all clauses are non-mandatory, the directional application of Max-RES solves the Max-SAT problem. If their proof is combined with the proof of correctness of DP [28] (namely, all clauses being mandatory), we have that the extension of DP to Max-SAT produces a correct algorithm. Max-DP (depicted in Figure 5) is the extension of DP to Max-SAT. Both algorithms are essentially equivalent, the main difference being that Max-DP performs Max-RES instead of classical resolution. Observe the parallelism between Function VarElim (Fig. 3) and Function Max-VarElim (Fig. 5). Both are in charge of the elimination of variable $x_i$ from the formula. As in the SAT case, Max-VarElim computes the bucket $\mathcal{B}$ (line 1) and removes its clauses from the formula (line 2). Then, it selects a clause $(x_i \vee A, u)$ and resolves it with all its clashing clauses. In VarElim, clause $x_i \vee A$ is resolved until no clashing clauses exist. In Max-VarElim, clause $(x_i \vee A, u)$ is resolved until its weight $u$ decreases to 0 or no clashing clauses exist. A noteworthy difference with respect to the SAT case is that Max-VarElim selects in line 4 a minimal size clause. This difference is not required for the correctness of the algorithm, but only to achieve the complexity stated in Theorem 23.

Footnote (to the non-termination remark above): this fact was first observed in the WCSP context by [34].

The following lemma shows that Max-VarElim transforms the input formula 
preserving its optimality. 

Lemma 19 Consider a call to the Max-VarElim function. Let $(\mathcal{F}, \top)$ denote the input formula and let $(\mathcal{F}', \top)$ denote the output formula. It is true that $(\mathcal{F}, \top)$ has models iff $(\mathcal{F}', \top)$ has models. Besides, if $(\mathcal{F}, \top)$ has models, the cost of the optimal one is the same as the cost of the optimal model of $(\mathcal{F}', \top)$.

PROOF. See Appendix A. 

Theorem 20 Algorithm Max-DP is correct. 

PROOF. Max-DP is a sequence of variable eliminations until a variable-free formula is obtained. Lemma 19 shows that each transformation preserves the cost of the optimal model. Therefore, the cost of the final variable-free formula $(\Box, u)$ is the cost of the optimal model of the original formula.
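
To make the procedure of Figure 5 concrete, here is a small executable sketch. It is an illustration under stated assumptions rather than the authors' implementation: literals are non-zero integers, a formula is a dictionary from `frozenset` clauses to weights, the weight operations are the bounded ones ($\top$ absorbs subtraction and sums saturate at $\top$), compensation clauses are put in clausal form with the linear rule, the inner loop visits each clashing clause once, variables are eliminated in increasing index order, and no Simplify step is performed. All names are hypothetical.

```python
def ominus(a, b, top):
    """Weight subtraction for a >= b; top absorbs."""
    return top if a == top else a - b

def add(formula, clause, w, top):
    """Aggregate (clause, w) into formula; skip tautologies and zero weights."""
    clause = frozenset(clause)
    if w > 0 and not any(-l in clause for l in clause):
        formula[clause] = min(top, formula.get(clause, 0) + w)

def add_negated(formula, a, block, w, top):
    """Add the clausal form of (A v NOT(block), w) using the linear rule."""
    block = sorted(block)
    for i, l in enumerate(block):
        add(formula, set(a) | {-l} | set(block[i + 1:]), w, top)

def max_var_elim(formula, top, x):
    """Eliminate variable x by Max-RES, following the structure of Fig. 5."""
    bucket = {c: w for c, w in formula.items() if x in c or -x in c}
    rest = {c: w for c, w in formula.items() if c not in bucket}
    while any(x in c for c in bucket):
        pos = min((c for c in bucket if x in c), key=len)  # minimal size clause
        u = bucket.pop(pos)
        a = pos - {x}
        for neg in [c for c in bucket if -x in c]:
            if u == 0:
                break
            b = neg - {-x}
            if any(-l in b for l in a):
                continue  # resolvent would be a tautology: no clash
            w = bucket.pop(neg)
            m = min(u, w)
            u = ominus(u, m, top)
            add(bucket, neg, ominus(w, m, top), top)   # posterior clause
            add_negated(bucket, a | {x}, b, m, top)    # (x v A v NOT B, m)
            add_negated(bucket, b | {-x}, a, m, top)   # (~x v B v NOT A, m)
            add(rest, a | b, m, top)                   # resolvent
    return rest  # whatever is left in the bucket is discarded

def max_dp(formula, top):
    """Eliminate all variables; the weight of the empty clause is the optimum."""
    formula = dict(formula)
    while True:
        variables = {abs(l) for c in formula for l in c}
        if not variables:
            return formula.get(frozenset(), 0)
        formula = max_var_elim(formula, top, min(variables))
```

A formula is best built with `add`, which also merges repeated clauses. Because nothing bounds the size of the intermediate formulas, the sketch is only practical on very small instances, which is consistent with the complexity results that follow.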

The following lemma shows that eliminating a variable has the same complexity in classical SAT and in Max-SAT.

Lemma 21 Let $(\mathcal{F}, \top)$ be a Max-SAT instance and $x_i$ one of its variables. Let $n_i$ denote the number of variables sharing some clause with $x_i$ in $\mathcal{F}$. The space and time complexity of Max-VarElim($\mathcal{F}$, $\top$, $x_i$) is $O(3^{n_i})$ and $O(9^{n_i})$, respectively.

PROOF. See Appendix A. 

The next lemma shows that the induced graph plays the same role in DP and in Max-DP.

Lemma 22 Let $d$ denote the reverse order in which Max-DP($\mathcal{F}$, $\top$) eliminates variables. The width of $x_i$ along $d$ in the induced graph $G^{*}_{d}(\mathcal{F})$ bounds above the number of variables sharing some clause with $x_i$ at the time of its elimination.

PROOF. Same as the SAT case (Lemma 3).

The following theorem, which trivially follows from the previous two lemmas, bounds the complexity of Max-DP.

Theorem 23 Let $(\mathcal{F}, \top)$ be an arbitrary Max-SAT instance. Let $d$ denote the reverse order in which Max-DP($\mathcal{F}$, $\top$) eliminates variables. The space and time complexity of Max-DP($\mathcal{F}$, $\top$) is $O(n \times 3^{w^{*}_{d}})$ and $O(n \times 9^{w^{*}_{d}})$, respectively, where $w^{*}_{d}$ is the induced width of the interaction graph $G(\mathcal{F})$ along $d$.

Observe that the complexities of DP and Max-DP are the same, even though Max-SAT has a complexity higher than SAT. The same phenomenon has already been observed with respect to CSP and its optimization version WCSP when using the bucket-elimination [23] algorithm. Note that bucket-elimination is a meta-algorithm based on the variable-elimination principle and DP and Max-DP are particular instantiations of it. The following remark shows that Max-DP is a true extension of DP.

Remark 24 The execution of Max-DP with a SAT instance (i.e., $(\mathcal{F}, \top)$ with $\top = 1$) behaves like classical DP.



5 Efficient Inference 



The complexity results of the previous section show that solving Max-SAT with pure resolution methods is in general too space consuming and can only be used in practice with formulas with a small induced width (around 30 with current computers). A natural alternative is to use only restricted forms of resolution that simplify the formula and use search afterwards. In this section we summarize some simplification rules that have been proposed in the recent Max-SAT literature and show that they can be naturally explained with our framework. We also introduce two original ones that will be used in the solver that we introduce in Section 6.

We classify these simplification rules in three categories: single applications of res- 
olution, multiple applications of resolution (namely, hyper-resolution), and variable 
elimination. 
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5.1 Single Resolution 



Proposition 25 Unit clause reduction (also called upper bound rule in [13]),

$$\{(l, \top), (\bar{l} \vee A, w)\} \equiv \{(l, \top), (A, w)\}$$

is a particular case of Max-RES.

PROOF. If $w = \top$, we have the classical SAT case, which is trivial. If $w < \top$, we have that the application of Max-RES to $\{(l \vee \Box, \top), (\bar{l} \vee A, w)\}$ produces $\{(A, w), (l, \top \ominus w), (\bar{l} \vee A, w \ominus w), (l \vee \Box \vee \bar{A}, w), (\bar{l} \vee \neg\Box \vee A, w)\}$.

The third clause can be removed because $w \ominus w = 0$. The fourth clause can be removed because it is absorbed by the second. The fifth clause can be removed because it is a tautology.

Proposition 26 Neighborhood resolution [1] (also called replacement of almost common clauses in [8]),

$$\{(l \vee A, u), (\bar{l} \vee A, w)\} \equiv \{(A, w), (l \vee A, u \ominus w)\}$$

where, without loss of generality, $w \le u$, is a particular case of Max-RES.

PROOF. Resolving the two left-hand clauses, we obtain $\{(A, w), (l \vee A, u \ominus w), (\bar{l} \vee A, w \ominus w), (l \vee A \vee \bar{A}, w), (\bar{l} \vee \bar{A} \vee A, w)\}$. The third clause can be omitted because either its weight is 0 (when $w < \top$), or it is absorbed by the resolvent (when $w = \top$). The fourth and fifth clauses can be omitted because they are tautologies.



The simplification potential of neighborhood resolution is shown in the following 
example. 

Example 27 Consider the formula $\{(z \vee y, 1), (\bar{y} \vee z, 1), (\bar{z}, 1)\}$. The application of neighborhood resolution yields $\{(z, 1), (\bar{z}, 1)\}$, which allows a new application of neighborhood resolution producing the trivial formula $\{(\Box, 1)\}$.
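
The rule can be applied to a fixpoint with a few lines of code. The sketch below is illustrative only: it assumes integer literals, a dictionary from `frozenset` clauses to weights, and the bounded subtraction in which $\top$ absorbs. It also skips a pair of clauses whose resolvent is already mandatory, a guard added here so that the loop terminates on hard clauses. The function names are hypothetical.

```python
def ominus(a, b, top):
    """Weight subtraction for a >= b; top absorbs."""
    return top if a == top else a - b

def neighborhood_resolution(formula, top):
    """Apply {(l v A, u), (~l v A, w)} -> {(A, m), (l v A, u-m), (~l v A, w-m)}
    with m = min(u, w) until no pair of clauses matches."""
    formula = dict(formula)
    changed = True
    while changed:
        changed = False
        for clause in list(formula):
            for l in list(clause):
                rest = clause - {l}
                twin = rest | {-l}
                if clause not in formula or twin not in formula:
                    continue
                if formula.get(rest, 0) == top:
                    continue  # resolvent already mandatory: nothing to gain
                m = min(formula[clause], formula[twin])
                for c in (clause, twin):
                    left = ominus(formula.pop(c), m, top)
                    if left > 0:
                        formula[c] = left
                formula[rest] = min(top, formula.get(rest, 0) + m)
                changed = True
    return formula
```

With z = 1 and y = 2, the formula of Example 27 collapses to a single empty clause of weight 1, represented by the key `frozenset()`.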

The term neighborhood resolution was coined by [36] in the SAT context. The Max-SAT extension was first proposed in [8]. The practical efficiency of the $|A| = 0, 1, 2$ cases was assessed in [37,38], [14] and [1], respectively. In the WCSP context, it is related to the notion of projection and has been used to enforce node and arc-consistency [34,33].






5.2 Variable elimination 

Proposition 28 The pure literal rule (first proposed in the Max-SAT context in [8]) is a special case of Max-VarElim.

PROOF. Consider a formula $\mathcal{F}$ such that there is a literal $l$ whose negation does not appear in the formula. Let $x = var(l)$. Function Max-VarElim($\mathcal{F}$, $\top$, $x$) has the same effect as the pure literal rule, because there is no pair of clauses clashing on $x$. Thus, no resolution will be performed and all clauses containing $l$ will be removed from the formula.

Proposition 29 The elimination rule [8] (also called resolution in [9,10]) which says that if $\mathcal{F} = \{(l \vee A, u), (\bar{l} \vee B, w)\} \cup \mathcal{F}'$ and $var(l)$ does not occur in $\mathcal{F}'$ then

$$\mathcal{F} \equiv \mathcal{F}' \cup \{(A \vee B, \min\{u, w\})\}$$

is a special case of Max-VarElim.

PROOF. Let $x$ be the clashing variable (namely, $x = var(l)$). We need to prove that Function Max-VarElim with $x$ as the elimination variable replaces $\{(l \vee A, u), (\bar{l} \vee B, w)\}$ by $\{(A \vee B, \min\{u, w\})\}$. There are two possibilities. If $(l \vee A, u)$ and $(\bar{l} \vee B, w)$ clash, they will be resolved and $(A \vee B, \min\{u, w\})$ will be added to the formula. All the clauses in the bucket after the resolution step do not clash on $x$, so Max-VarElim will discard them. If $(l \vee A, u)$ and $(\bar{l} \vee B, w)$ do not clash, Max-VarElim will directly discard them. In that case, $A \vee B$ either is a tautology or is absorbed, so it has no effect on the right-hand side of the elimination rule.

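Proposition 30 Let $\tilde{x}$ denote either $x$ or $\bar{x}$. The small subformula rule [9], which says that, if $\mathcal{F} = \{(\tilde{x} \vee \tilde{y} \vee A, u), (\tilde{x} \vee \tilde{y} \vee B, w), (\tilde{x} \vee \tilde{y} \vee C, v)\} \cup \mathcal{F}'$ and $x, y$ do not occur in $\mathcal{F}'$ then

$$\mathcal{F} \equiv \mathcal{F}'$$

is a special case of Max-VarElim.

PROOF. We only need to prove that if we eliminate $x$ and $y$ from $\{(\tilde{x} \vee \tilde{y} \vee A, u), (\tilde{x} \vee \tilde{y} \vee B, w), (\tilde{x} \vee \tilde{y} \vee C, v)\}$ with function Max-VarElim, we obtain the empty formula $\emptyset$.

If all the occurrences of $x$ or $y$ have the same sign, the rule holds because the pure literal rule can be applied. If there are occurrences of different sign, there are only two cases to consider (all other cases are symmetric):

• If we have $\{(x \vee y \vee A, u), (\bar{x} \vee \bar{y} \vee B, w), (x \vee y \vee C, v)\}$, there are no clauses clashing on $x$ (nor on $y$), so Max-VarElim will just discard the clauses.

• If we have $\{(x \vee y \vee A, u), (\bar{x} \vee y \vee B, w), (\bar{x} \vee \bar{y} \vee C, v)\}$, the first and second clauses clash, so Max-RES produces, with $m = \min\{u, w\}$,

$$\{(y \vee A \vee B, m), (x \vee y \vee A, u \ominus m), (\bar{x} \vee y \vee B, w \ominus m), (x \vee y \vee A \vee \overline{y \vee B}, m), (\bar{x} \vee \overline{y \vee A} \vee y \vee B, m), (\bar{x} \vee \bar{y} \vee C, v)\}$$

which is equivalent to,

$$\{(y \vee A \vee B, m), (x \vee y \vee A, u \ominus m), (\bar{x} \vee y \vee B, w \ominus m), (x \vee y \vee A \vee \bar{B}, m), (\bar{x} \vee \bar{A} \vee y \vee B, m), (\bar{x} \vee \bar{y} \vee C, v)\}$$

There are no further clauses clashing on $x$, so Max-VarElim will just discard all the clauses that mention it, producing the equivalent $\{(y \vee A \vee B, m)\}$. The pure literal rule will eliminate the clause, producing the empty formula.

Fig. 6. Graphical representation of Max-RES. The two prior clashing clauses $(x \vee A, u)$ and $(\bar{x} \vee B, w)$ are on top (in bold) and are linked to the resolvent $(A \vee B, m)$. To the left of the resolvent are the posterior clashing clauses $(x \vee A, u \ominus m)$ and $(\bar{x} \vee B, w \ominus m)$, and the compensation clauses $(x \vee A \vee \bar{B}, m)$ and $(\bar{x} \vee \bar{A} \vee B, m)$.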



5.3 Hyper-resolution 

Hyper-resolution is a well known SAT concept that refers to the compression of several resolution steps into one single step. In the following, we introduce four hyper-resolution inference rules. The first two (star rule and dominating unit-clause) are formal descriptions of already published rules. The other two rules (cycle and chain resolution) are original. We prove the correctness of these rules by developing the resolution tree that transforms the left-hand side of the rule into the right-hand side. Figure 6 shows the graphical representation of Max-RES. On top there are the two prior clashing clauses. We write them in bold face to emphasize that they are removed from the formula. The resolvent is linked to the prior clashing clauses. At the left of the resolvent, we write the posterior clashing clauses and the compensation clauses, which must be added to preserve equivalence.



Fig. 7. Resolution tree of the star rule. [Tree not reproduced. At step $i$ the current clause $(l_i \vee \cdots \vee l_k, m)$ is resolved with the unit clause $(\bar{l}_i, u_i)$; the last step derives $(\Box, m)$.]



5.3.1 Star rule 



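The star rule [9,14] identifies a clause of length $k$ such that each of its literals appears negated in a unit clause. Then, at least one of the clauses will be violated. Formally,

$$
\left\{
\begin{array}{l}
(l_1 \vee l_2 \vee \ldots \vee l_k, w), \\
(\bar{l}_i, u_i)_{1 \le i \le k}
\end{array}
\right\}
\equiv
\left\{
\begin{array}{l}
(l_1 \vee l_2 \vee \ldots \vee l_k, w \ominus m), \\
(\bar{l}_i \vee \overline{l_{i+1} \vee l_{i+2} \vee \ldots \vee l_k}, m)_{1 \le i < k}, \\
(\bar{l}_i, u_i \ominus m)_{1 \le i \le k}, \\
(\Box, m)
\end{array}
\right\}
$$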
where $m = \min\{w, u_1, u_2, \ldots, u_k\}$.

This rule can be proved in $k$ resolution steps. Assume, without loss of generality, that $u_i \le u_{i+1}$ for all $1 \le i < k$. Assume as well that $u_k < \top$ (otherwise unit clause reduction could have been previously triggered). Let $m = \min\{w, u_1\}$. Figure 7 shows the corresponding resolution tree. Recall that bold clauses are resolved, so they must be removed from the formula. Essentially, each unit clause is used to eliminate one literal from the length $k$ clause. At the end, we derive the empty clause.

Fig. 8. Resolution tree of the dominating unit clause rule. [Tree not reproduced. At step $j$ the unit clause $(l, u \ominus w_1 \ominus \cdots \ominus w_{j-1})$ is resolved with $(\bar{l} \vee B_j, w_j)$, producing $(B_j, w_j)$.]




5.3.2 Dominating unit-clause 

The dominating unit-clause rule [9] (also called UP5 in [13]) says that if the weight of a unit clause $(l, u)$ is higher than the sum of the weights of the clauses in which $\bar{l}$ appears, we can safely assign $l$ to the formula. Formally, if

$$\mathcal{F} = \{(l, u)\} \cup \{(l \vee A_i, u_i)\}_{i=1}^{p} \cup \{(\bar{l} \vee B_j, w_j)\}_{j=1}^{k} \cup \mathcal{F}'$$

with $u \ge \sum_{j=1}^{k} w_j$ and $\mathcal{F}'$ does not contain any occurrence of $l$ or $\bar{l}$, then $\mathcal{F} \equiv \mathcal{F}[l]$.

This rule can be proved in $k$ resolution steps plus the application of the pure literal rule. Figure 8 shows the corresponding resolution tree. As in the previous case, we can assume that weight $u$ is less than $\top$ because otherwise the unit clause reduction could have been triggered. At each step, unit clause $l$ is resolved with one $(\bar{l} \vee B_j, w_j)$. Since, by definition, the remaining weight of $l$ is larger than or equal to $w_j$, clause $\bar{l} \vee B_j$ is replaced by $B_j$. At the end of the process, there is no clause mentioning $\bar{l}$, so the pure literal rule can be applied, which proves the correctness of the rule.



5.3.3 Chain resolution 

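Our original chain resolution rule identifies a subset of chained binary clauses and two unit clauses associated with the ends. When such a pattern exists, a sequence of unit resolution steps suffices to derive the empty clause. The rule is the following.

$$
\left\{
\begin{array}{l}
(l_1, u_1), \\
(\bar{l}_i \vee l_{i+1}, u_{i+1})_{1 \le i < k}, \\
(\bar{l}_k, u_{k+1})
\end{array}
\right\}
\equiv
\left\{
\begin{array}{l}
(l_i, m_i \ominus m_{i+1})_{1 \le i \le k}, \\
(\bar{l}_i \vee l_{i+1}, u_{i+1} \ominus m_{i+1})_{1 \le i < k}, \\
(l_i \vee \bar{l}_{i+1}, m_{i+1})_{1 \le i < k}, \\
(\bar{l}_k, u_{k+1} \ominus m_{k+1}), \\
(\Box, m_{k+1})
\end{array}
\right\}
$$

where $m_i = \min\{u_1, u_2, \ldots, u_i\}$ and $var(l_i) \ne var(l_j)$ for all $1 \le i < j \le k$. This rule can also be proved in $k$ steps of resolution. Figure 9 shows the corresponding resolution tree. Starting with unit clause $l_1$, at each resolution step a unit clause $l_i$ is resolved with $(\bar{l}_i \vee l_{i+1}, u_{i+1})$, which produces the unit clause $l_{i+1}$ to be used in the following resolution step. The last unit clause obtained is $l_k$ and it is resolved with $(\bar{l}_k, u_{k+1})$, which derives the empty clause.

Example 31 Consider the following formula $\{(x, 2), (\bar{x} \vee y, 1), (\bar{y} \vee z, \top), (\bar{z}, 2)\}$. If we resolve $(x, 2)$ and $(\bar{x} \vee y, 1)$ we obtain $\{(x, 1), (y, 1), (x \vee \bar{y}, 1), (\bar{y} \vee z, \top), (\bar{z}, 2)\}$. If we resolve $(y, 1)$ and $(\bar{y} \vee z, \top)$ we obtain $\{(x, 1), (x \vee \bar{y}, 1), (z, 1), (y \vee \bar{z}, 1), (\bar{y} \vee z, \top), (\bar{z}, 2)\}$. Next, if we resolve $(z, 1)$ and $(\bar{z}, 2)$, we obtain $\{(x, 1), (x \vee \bar{y}, 1), (y \vee \bar{z}, 1), (\bar{y} \vee z, \top), (\bar{z}, 1), (\Box, 1)\}$.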
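
The right-hand side of the chain rule depends only on the running minima $m_i$, so it can be computed in one pass. The sketch below is an illustration under assumed conventions (integer literals, bounded subtraction in which $\top$ absorbs, zero-weight clauses dropped); the function name `chain_resolution` and its argument layout are hypothetical.

```python
def ominus(a, b, top):
    """Weight subtraction for a >= b; top absorbs."""
    return top if a == top else a - b

def chain_resolution(lits, u, top):
    """Right-hand side of the chain rule as a dict clause -> weight.

    lits = [l_1, ..., l_k]; u = [u_1, ..., u_{k+1}], where u_1 weighs (l_1),
    u_{i+1} weighs (~l_i v l_{i+1}) and u_{k+1} weighs (~l_k).
    """
    k = len(lits)
    m = [u[0]]
    for w in u[1:]:
        m.append(min(m[-1], w))  # m[i] holds m_{i+1} = min(u_1, ..., u_{i+1})
    out = {}

    def put(clause, w):
        if w > 0:
            out[frozenset(clause)] = w

    for i in range(k):  # unit clauses (l_i, m_i - m_{i+1})
        put([lits[i]], ominus(m[i], m[i + 1], top))
    for i in range(k - 1):
        # posterior binary clause, then its compensation clause
        put([-lits[i], lits[i + 1]], ominus(u[i + 1], m[i + 1], top))
        put([lits[i], -lits[i + 1]], m[i + 1])
    put([-lits[k - 1]], ominus(u[k], m[k], top))
    put([], m[k])  # increment of the empty clause
    return out
```

Encoding x, y, z as 1, 2, 3 and choosing an arbitrary $\top = 10$ (the example does not fix its value), `chain_resolution([1, 2, 3], [2, 1, 10, 2], 10)` reproduces the final formula of Example 31.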
Fig. 9. Resolution tree of chain resolution. [Tree not reproduced. At step $i$ the unit clause $(l_i, m_i)$ is resolved with $(\bar{l}_i \vee l_{i+1}, u_{i+1})$; the last step resolves $(l_k, m_k)$ with $(\bar{l}_k, u_{k+1})$.]



Observe that chain resolution with $k = 1$ reduces to neighborhood resolution, with $k = 2$ it reduces to the star rule, and with $k = 3$ it is the 3-RES rule proposed in [2]. Chain resolution with $k = 2$ is also related to the enforcement of existential arc consistency in WCSP [22].



5.3.4 Cycle Resolution 

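Our original cycle resolution rule identifies a subset of binary clauses with a cyclic structure. When such a pattern exists, a sequence of resolution steps with binary clauses suffices to derive a new unit clause. The rule is the following.

$$
\left\{
\begin{array}{l}
(\bar{l}_i \vee l_{i+1}, u_i)_{1 \le i < k}, \\
(\bar{l}_1 \vee \bar{l}_k, u_k)
\end{array}
\right\}
\equiv
\left\{
\begin{array}{l}
(\bar{l}_1 \vee l_i, m_{i-1} \ominus m_i)_{2 \le i \le k}, \\
(\bar{l}_i \vee l_{i+1}, u_i \ominus m_i)_{2 \le i < k}, \\
(\bar{l}_1 \vee l_i \vee \bar{l}_{i+1}, m_i)_{2 \le i < k}, \\
(l_1 \vee \bar{l}_i \vee l_{i+1}, m_i)_{2 \le i < k}, \\
(\bar{l}_1 \vee \bar{l}_k, u_k \ominus m_k), \\
(\bar{l}_1, m_k)
\end{array}
\right\}
$$

Fig. 10. Resolution tree of cycle resolution. [Tree not reproduced. The first step resolves $(\bar{l}_1 \vee l_2, u_1)$ with $(\bar{l}_2 \vee l_3, u_2)$; each later step resolves $(\bar{l}_1 \vee l_i, m_{i-1})$ with $(\bar{l}_i \vee l_{i+1}, u_i)$; the last step resolves $(\bar{l}_1 \vee l_k, m_{k-1})$ with $(\bar{l}_1 \vee \bar{l}_k, u_k)$ and derives $(\bar{l}_1, m_k)$.]
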
function Simplify($\mathcal{F}$, $\top$)
 1.  stop := false
 2.  do
 3.      if $(l, \top) \in \mathcal{F}$ then apply $\mathcal{F}[l]$
 4.      elseif $\{(C, u), (C, w)\} \subseteq \mathcal{F}$ then apply Aggregation
 5.      elseif $\{(\Box, u), (C, w)\} \subseteq \mathcal{F} \wedge u \oplus w = \top$ then apply Hardening
 6.      elseif $\{(x \vee A, u), (\bar{x} \vee A, w)\} \subseteq \mathcal{F}$ then apply Neighborhood Res.
 7.      elseif $\{(l_1, u_1), (\bar{l}_i \vee l_{i+1}, u_{i+1})_{1 \le i < k}, (\bar{l}_k, u_{k+1})\} \subseteq \mathcal{F}$ then apply Chain Res.
 8.      elseif $\{(\bar{l} \vee h, u), (\bar{l} \vee \bar{q}, v), (\bar{h} \vee q, w)\} \subseteq \mathcal{F}$ then apply Cycle Res.
         else stop := true
 9.  until ($(\Box, \top) \in \mathcal{F} \vee$ stop)
 10. return ($\mathcal{F}$)
endfunction

function Max-DPLL($\mathcal{F}$, $\top$) return nat
 11. $\mathcal{F}$ := Simplify($\mathcal{F}$, $\top$)
 12. if $\mathcal{F} = \emptyset$ then return 0
 13. if $\mathcal{F} = \{(\Box, w)\}$ then return $w$
 14. $l$ := SelectLiteral($\mathcal{F}$)
 15. $\top$ := Max-DPLL($\mathcal{F}[l]$, $\top$)
 16. $\top$ := Max-DPLL($\mathcal{F}[\bar{l}]$, $\top$)
 17. return $\top$
endfunction

Fig. 11. Max-DPLL enhanced with inference. Function Simplify($\mathcal{F}$, $\top$) converts the input formula into a simpler one. Note that in our implementation, for efficiency reasons, we only consider the $|A| \le 1$ and $|C| \le 2$ case.
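
For reference, the search skeleton of Figure 11 can be written in a few lines when Simplify is reduced to the bound check alone. This is a deliberately stripped-down sketch and not the solver described in this paper: none of the inference rules is applied, literals are assumed to be non-zero integers, a formula is a dictionary from `frozenset` clauses to weights, and the branching literal is chosen naively. The names `assign` and `max_dpll` are hypothetical.

```python
def assign(formula, lit, top):
    """F[l]: drop clauses satisfied by lit, delete ~lit from the others,
    and aggregate clauses that become equal (weights saturate at top)."""
    out = {}
    for clause, w in formula.items():
        if lit in clause:
            continue
        reduced = clause - {-lit}
        out[reduced] = min(top, out.get(reduced, 0) + w)
    return out

def max_dpll(formula, top):
    """Return the optimal cost if it is below top, and top otherwise."""
    empty = frozenset()
    if formula.get(empty, 0) >= top:
        return top  # the lower bound has reached the upper bound: prune
    literals = {l for c in formula for l in c}
    if not literals:
        return formula.get(empty, 0)
    lit = min(literals, key=lambda l: (abs(l), l))  # naive, deterministic choice
    top = max_dpll(assign(formula, lit, top), top)
    top = max_dpll(assign(formula, -lit, top), top)
    return top
```

As in lines 15 and 16 of the figure, the value returned by the first branch becomes the upper bound of the second, so the second branch is pruned as soon as the weight of its empty clause reaches the best cost found so far.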



where $m_i = \min\{u_1, u_2, \ldots, u_i\}$ and $var(l_i) \ne var(l_j)$ for all $1 \le i < j \le k$. This rule can be proved in $k - 1$ steps of resolution. Figure 10 shows the corresponding resolution tree. The use of the cycle rule is to derive new unit clauses that, in turn, can be used by chain resolution to increase the weight of the empty clause.

Example 32 Consider the formula $\{(\bar{x}_1 \vee x_2, 1), (x_1 \vee x_3, 1), (\bar{x}_2 \vee x_3, 1), (\bar{x}_3 \vee x_4, 1), (\bar{x}_4 \vee x_5, 1), (\bar{x}_5, 1)\}$. We can apply the cycle rule to the first three clauses obtaining $\{(x_3, 1), (\bar{x}_1 \vee x_2 \vee \bar{x}_3, 1), (x_1 \vee \bar{x}_2 \vee x_3, 1), (\bar{x}_3 \vee x_4, 1), (\bar{x}_4 \vee x_5, 1), (\bar{x}_5, 1)\}$. Chain resolution can be applied to the unary and binary clauses producing $\{(\bar{x}_1 \vee x_2 \vee \bar{x}_3, 1), (x_1 \vee \bar{x}_2 \vee x_3, 1), (x_3 \vee \bar{x}_4, 1), (x_4 \vee \bar{x}_5, 1), (\Box, 1)\}$.

Observe that cycle resolution with $k = 3$ is one particular case of the so-called high-order consistencies proposed in [39] for WCSP. In particular, it is a weighted version, restricted to boolean variables, of path inverse consistency [37].
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6 An efficient Max-SAT solver 



In the previous section we presented a set of simplification rules. Some of them have been previously proposed by other researchers, while others are original. We showed that all of them can be viewed as special cases of resolution, hyper-resolution or variable elimination. In this section we consider their incorporation into the Max-DPLL algorithm introduced in Subsection 4.2. The idea is to use these rules to simplify the current Max-SAT formula before letting Max-DPLL branch on one of the variables. Our experimental work indicates that it is not cost effective to apply all of them on a general basis. We have observed that only three rules are useful in general: neighborhood resolution, chain resolution and cycle resolution. Besides, it only pays off to apply these rules to clauses of very small size (up to 2). The reason is that there is only a quadratic number of such clauses, which bounds the overhead of detecting the situations in which the rules can be triggered. Regarding cycle resolution, we only found it effective to apply the $k = 3$ case (namely, considering triplets of variables). Note that the fact that our solver only incorporates these three rules does not prevent other rules from being effective in classes of problems on which we did not experiment.

A high-level description of our solver appears in Figure 11. It is Max-DPLL augmented with the simplification rules in function Simplify. This function iteratively simplifies the formula. It stops when a contradiction is derived or no further simplification can be done (line 9). Simplification rules are arranged in an ordered manner, which means that if two rules $R$ and $R'$ can be applied, and rule $R$ has higher priority than rule $R'$, the algorithm will choose $R$. The rules with the highest priority are unit clause reduction and absorption, grouped in the assignment $\mathcal{F}[l]$ operation (line 3). Next, we have aggregation (line 4), hardening (line 5), neighborhood resolution (line 6), chain resolution (line 7) and cycle resolution restricted to cycles of length 3 (line 8).

Although our actual implementation is conceptually equivalent to the pseudo-code of Figure 11, it should be noted that such code aims at clarity and simplicity. Thus, a direct translation into a programming language is highly inefficient. The main source of inefficiency is the time spent searching for clauses that match the left-hand side of the simplification rules. This overhead, which depends on the number of clauses, takes place at each iteration of the loop. As we mentioned, our current implementation only takes into account clauses of arity less than or equal to two. Another way to decrease such overhead is to identify those events that may make a transformation applicable. For instance, a clause may be made mandatory (line 5) only when its weight or the weight of the empty clause increases. Our implementation reacts to these events and triggers the corresponding rules. This approach is well known in the constraint satisfaction field and it is usually implemented with streams of pending events [40,22].
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The way in which we detect the chain resolution pattern also deserves special con- 
sideration. At each search node, we consider the set of binary and unary clauses 
and compute the corresponding implication graph defined as follows: 

• for each variable $x_i$, the graph has two vertices $x_i$ and $\bar{x}_i$,

• for each binary clause $(l_i \vee l_j, u)$, the graph has two arcs: $(\bar{l}_i, l_j)$ and $(\bar{l}_j, l_i)$. We say that these two arcs are complementary.

• if the formula contains the unit clause $(l, u)$ we say that vertex $l$ is a starting vertex, and vertex $\bar{l}$ is an ending vertex.

It is easy to see that if there is a path $(l_1, l_2, \ldots, l_k)$, where $l_1$ and $l_k$ are starting and ending, respectively, and the path does not cross any pair of complementary arcs, then chain resolution can be applied and the path tells the order in which resolution must be applied.

In our implementation, we select one arbitrary starting vertex and compute short- 
est paths to all ending vertices using Dijkstra's algorithm. If one of the paths does 
not cross complementary arcs, we trigger the rule. Else, another starting vertex is 
selected and the process is repeated. Note that this method does not necessarily 
detect all the potential applications of chain resolution because it only takes into 
consideration one path between each pair of starting and ending vertices (the short- 
est path given by Dijkstra). The fact that this path crosses complementary arcs does 
not prevent the existence of other paths that do not cross complementary arcs. We 
believe that a better approach would be to use a flow algorithm, but we have not yet 
studied this possibility. 
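
The detection scheme just described can be sketched as follows. This is not the TOOLBAR code: it is a small illustration in which breadth-first search replaces Dijkstra's algorithm (all arcs are given unit length here, so both return shortest paths), literals are assumed to be non-zero integers, and the function name `find_chain` is hypothetical. Besides rejecting paths that cross a pair of complementary arcs, the sketch also requires the variables on the path to be pairwise distinct, as the side condition of the chain rule demands.

```python
from collections import deque

def find_chain(units, binaries):
    """Look for a chain-resolution pattern in the implication graph.

    units: list of literals that occur in unit clauses.
    binaries: list of pairs (a, b), one per binary clause (a v b).
    Returns a path [l_1, ..., l_k] from a starting to an ending vertex, or None.
    """
    arcs = {}
    for idx, (a, b) in enumerate(binaries):
        arcs.setdefault(-a, []).append((b, idx))  # arc ~a -> b
        arcs.setdefault(-b, []).append((a, idx))  # complementary arc ~b -> a
    ending = sorted({-l for l in units})
    for start in units:
        parent = {start: None}
        queue = deque([start])
        while queue:  # breadth-first search from the starting vertex
            v = queue.popleft()
            for nxt, idx in arcs.get(v, []):
                if nxt not in parent:
                    parent[nxt] = (v, idx)
                    queue.append(nxt)
        for end in ending:
            if end not in parent:
                continue
            path, used, v = [], [], end
            while v is not None:  # walk back to the starting vertex
                path.append(v)
                step = parent[v]
                if step is None:
                    v = None
                else:
                    used.append(step[1])
                    v = step[0]
            path.reverse()
            no_complementary = len(set(used)) == len(used)
            distinct_vars = len({abs(l) for l in path}) == len(path)
            if no_complementary and distinct_vars:
                return path
    return None
```

Like the method described above, the sketch inspects a single shortest path per pair of starting and ending vertices, so it can miss chains that exist along other paths.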



7 Experimental Results 

We divide the experiments in two parts. The purpose of the first part is to assess the 
importance of each one of the inference rules that our solver incorporates. These ex- 
periments include random Max-SAT instances and random Max-clique problems. 
The purpose of the second part is to evaluate the performance of our solver in com- 
parison to other available solving techniques. These experiments include random 
weighted and unweighted Max-SAT instances, random and structured Max-one 
problems, random Max-cut problems, random and structured max-clique problems 
and combinatorial auctions. 

Our solver, written in C, is available as part of the TOOLBAR software^. Benchmarks are also available in the TOOLBAR repository. In all the experiments with random instances, samples have 30 instances and plots report mean CPU time in seconds. Executions were made on a 3.2 GHz Pentium 4 computer with Linux. Unless otherwise indicated, executions were aborted when they reached a time limit of 1200 seconds. In the legend of every plot, the order of the items reflects the relative performance order of the different competitors.

^ http://carlit.toulouse.inra.fr/cgi-bin/awki.cgi/Toolbarlnfo

7.1 Adding Inference to Max-DPLL

We consider the following versions of our solver:

(1) Basic Max-DPLL. Namely, the algorithm of Figure 11 in which lines 6-8 in Function Simplify are commented out. We denote this algorithm Max-DPLL-1.

(2) The previous algorithm enhanced with neighborhood resolution (namely, lines 7-8 in Simplify are commented out). We denote this algorithm Max-DPLL-2.

(3) The previous algorithm enhanced with chain resolution (namely, line 8 in Simplify is commented out). We denote this algorithm Max-DPLL-3.

(4) The previous algorithm enhanced with cycle resolution (namely, all the lines in Simplify are considered). We denote this algorithm Max-DPLL-4.

For the first experiment we consider random Max-SAT instances. A random $k$-SAT formula is defined by three parameters $\langle k, n, m \rangle$: $k$ is the length of the clauses, $n$ is the number of variables and $m$ is the number of clauses. Each clause is randomly generated by selecting $k$ distinct variables with a uniform probability distribution. The sign of each variable in each clause is randomly decided. In the following experiments we generate instances in which the number of clauses is always sufficiently high to make the formula unsatisfiable, and we solve the corresponding Max-SAT problem. We used the Cnfgen^ generator. Note that it allows repeated clauses, so $v$ repetitions of a clause $C$ are grouped into one weighted clause $(C, v)$.

Figure 12 (top-left) reports results on random Max-2-SAT instances with 100 variables and a varying number of clauses. It can be seen that Max-DPLL-1 performs very poorly and can only solve instances with up to 200 clauses. The addition of neighborhood resolution (namely, Max-DPLL-2) improves its performance by 2 orders of magnitude and makes it possible to solve instances with up to 300 clauses. The further addition of chain resolution provides a spectacular improvement, which makes it possible to solve instances with up to 750 clauses. Finally, the addition of cycle resolution makes it possible to solve, in 100 seconds, instances of up to 1000 clauses.

The Max-clique problem is the problem of finding the maximum size clique embedded in a given graph. It is known that solving the Max-clique problem of graph $G = (V, E)$ is equivalent to solving the Min-covering problem of graph $G' = (V, E')$ where $E'$ is the complementary of $E$ (namely, $(u, v) \in E'$ iff $(u, v) \notin E$). Therefore, we solved Max-clique instances by encoding into Max-SAT the corresponding min-vertex problem as described in Example 5.

^ ftp://dimacs.rutgers.edu/pub/challenge/satisfiability/contributed/UCSC/instances

Fig. 12. Experimental results of different algorithms on random Max-SAT and Max-clique instances. [Plots not reproduced.]

A random graph is defined by two parameters <n,e > where n is the number of 
nodes and e is the number of edges. Edges are randomly decided using a uniform 
probability distribution. Figure 12 (bottom) reports the results of solving the max- 
clique problem of random graphs with 150 nodes and varying number of edges. 
It can be observed that the instances with connectivity lower than 50 percent are 
trivially solved by our 4 algorithms. Note that instances with small connectivity 
have an associated Max-SAT encoding containing a large number of hard clauses. 
Hence, the unit clause reduction rule is applied very frequently on those instances. 
This is the reason why they are so easily solved. However, as the connectivity is 
increased, the differences between all the versions are greater. Little improvement is 
noticed for Max-DPLL-2 over Max-DPLL-1. For connectivities between 76% and 
99% the greatest differences are found. While Max-DPLL-1 and Max-DPLL-2 are 
unable to solve those instances, both Max-DPLL-3 and Max-DPLL-4 perform well. 
With a connectivity near to 90%, it can be observed that using the cycle resolution 
reports a noticeable improvement. 

From these experiments we conclude that the synergy of the three inference rules 
of Max-DPLL-4 produces an efficient algorithm. 
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7.2 Max-DPLL versus alternative solvers 

In the following experiments, we evaluate the performance of Max-DPLL-4 (we will refer to it simply as Max-DPLL). For that purpose, we compare Max-DPLL with the following state-of-the-art Max-SAT solvers: MaxSolver [13], UP [41] and LB4a [12]. They suffer from the following limitations:

• The available version of MaxSolver is restricted to instances with less than 200 variables and 1000 clauses.

• For implementation reasons, UP cannot deal with instances having clauses with high weights. Similarly, it cannot deal with instances that combine mandatory and weighted clauses.

• LB4a can only solve unweighted Max-2-SAT problems (i.e., it is restricted to binary clauses with unit weights and without repeated clauses).

Consequently, in the experiments we will only execute a solver if it is possible, 
according to its limitations. 

It is known that Max-SAT problems can also be solved with pseudo-boolean and SAT solvers. For the sake of a more comprehensive comparison, we also consider Pueblo [42] and Minisat [43], which are among the best pseudo-boolean and SAT solvers, respectively. In Appendix B, we describe how we translated the Max-SAT instances into these two frameworks. Note that pseudo-boolean formulas are equivalent to 0-1 integer linear programs (ILP). Thus, they can also be solved with a state-of-the-art ILP solver such as CPLEX. We have not considered this alternative because [11] showed that it is generally ineffective for Max-SAT instances. Max-SAT problems can also be solved with WCSP solvers [11]. We have not considered this type of solver in our study, because the reference WCSP solver is MEDAC [22], which uses techniques similar to those of Max-DPLL and can be roughly described as a non-boolean restricted version of Max-DPLL-3.

7.2.1 Random Max-k-SAT

For the following experiment, we generated random 2-SAT instances of 60 vari- 
ables and 3-SAT instances of 40 variables with varying number of clauses using 
the Cnfgen generator. We also generated random 2-SAT instances of 140 variables 
using the 2-SAT generator of [12] that does not allow repeated clauses. 

Figure 13 (top-left) presents the results on Max-2-SAT without repeated clauses.
It can be observed that Max-DPLL is the only algorithm that can solve problems
of up to 1000 clauses. The solver with the second best performance, UP,
is 6 times slower. A surprising observation is that the LB4a solver, which was
specifically designed for Max-2-SAT without repetitions, performs worse than the
other Max-SAT solvers in random unweighted Max-2-SAT. Figure 13 (top-right)
presents the results on Max-2-SAT with repeated clauses. Max-DPLL is again the
best algorithm. The second best solver, UP, is nearly 100 times slower in the
hardest instances. Figure 13 (bottom) presents the results on Max-3-SAT.
Max-DPLL again provides the best performance. The second best option, Lazy, is
about 10 times slower. It is worth noting that the alternative encodings
(namely, pseudo-boolean and SAT) do not seem to be effective in these instances.

[Figure 13 plots. Panels: Max-2-SAT, 140 vars; Max-2-SAT, 60 vars; Max-3-SAT,
40 vars. Horizontal axis: n. of clauses. Solvers plotted: Pueblo, Minisat, Lazy,
LB4a, MaxSolver, UP, Max-DPLL.]

Fig. 13. Random Max-2-SAT and Max-3-SAT. Max-2-SAT instances on the plot on the left
do not contain repeated clauses.



7.2.2 Max-one 

Given a satisfiable CNF formula, max-one is the problem of finding a model with
a maximum number of variables set to true. This problem can be encoded as
Max-SAT by considering the clauses in the original formula as mandatory and
adding a weighted unary clause $(x_i, 1)$ for each variable $x_i$ in the formula.
Note that solving this problem is much harder than solving the usual SAT problem,
because the search cannot stop as soon as a model is found. The optimal model
must be found and its optimality must be proved.
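
The encoding just described can be illustrated with a short Python sketch. It
is not the implementation used in the experiments. Several choices are
assumptions made here for illustration only: literals are signed integers
(DIMACS style), the mandatory weight is represented by `float("inf")`, and the
helper names `max_one_encoding`, `cost` and `brute_force_max_one` are
hypothetical. The exhaustive search is only meant for tiny formulas.

```python
from itertools import product

TOP = float("inf")  # assumed representation of the mandatory weight


def max_one_encoding(cnf, num_vars):
    """Original clauses become mandatory; one unary clause (x_i, 1) per variable."""
    hard = [(tuple(clause), TOP) for clause in cnf]
    soft = [((v,), 1) for v in range(1, num_vars + 1)]
    return hard + soft


def cost(weighted_clauses, assignment):
    """Sum of the weights of the clauses falsified by `assignment` (var -> bool)."""
    return sum(
        weight
        for clause, weight in weighted_clauses
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause)
    )


def brute_force_max_one(cnf, num_vars):
    """Enumerate all assignments and keep the first one of minimum cost."""
    weighted = max_one_encoding(cnf, num_vars)
    best_cost, best_assignment = None, None
    for values in product([False, True], repeat=num_vars):
        assignment = dict(zip(range(1, num_vars + 1), values))
        c = cost(weighted, assignment)
        if best_cost is None or c < best_cost:
            best_cost, best_assignment = c, assignment
    return best_cost, best_assignment


# (x1 or x2) and (not x1 or not x2) and (x2 or x3)
example_cnf = [[1, 2], [-1, -2], [2, 3]]
optimum, model = brute_force_max_one(example_cnf, 3)
```

In this small example every model sets exactly one of x1 and x2 to true, so the
optimal cost is 1: one variable must remain false. The point made in the text is
visible in the loop, which cannot stop at the first model it meets.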

Figure 14 shows results with random 3-SAT instances of 150 variables. Note that
UP cannot be executed in this benchmark because it cannot deal with mandatory
and weighted clauses simultaneously. The first thing to be observed is that Lazy
and Minisat do not perform well. Regarding the other solvers, Pueblo is the best
when the number of clauses is very small, but its relative efficiency decreases
as the number of clauses grows. MaxSolver has the opposite behavior, and
Max-DPLL always lies in the middle. The performance of all these solvers
converges as the number of clauses approaches the phase transition peak. The
reason is that, as the number of models decreases, the optimization part of the
Max-one problem loses relevance (the number of models to choose from decreases).

Fig. 14. Random Max-one instances.

Figure 15 reports results on the Max-one problem on selected satisfiable SAT
instances from the DIMACS challenge. The first column indicates the name of the
problem classes. The second column indicates the number of instances of each
class. The remaining columns report the performance of each solver as the
number of instances that could be solved within the time limit. If all the
instances could be solved, the number in parentheses is the mean time in seconds.
The "-" symbol in the MaxSolver column indicates that the instances could not be
executed due to the limitation that this solver has on the maximum number of
variables and clauses. As can be observed, MaxSolver and Lazy do not succeed in
this benchmark, which means that Max-DPLL is the only Max-SAT solver that can
deal with it. Its performance is comparable to the good performance of Minisat
and Pueblo. However, in the par16*c* instances Max-DPLL performs badly,
while in the par8* instances it performs better than the others.

7.2.3 Max-cut 

Given a graph $G = (V, E)$, a cut is defined by a subset of vertices
$U \subseteq V$. The size of a cut is the number of edges $(v_i, v_j)$ such that
$v_i \in U$ and $v_j \in V - U$. The Max-cut problem consists of finding a cut of
maximum size. It is encoded as Max-SAT by associating one variable $x_i$ to each
graph vertex. Value t (respectively, f) indicates that vertex $v_i$ belongs to
$U$ (respectively, to $V - U$). For each edge $(v_i, v_j)$, there are two clauses
$x_i \vee x_j$ and $\bar{x}_i \vee \bar{x}_j$. Given a complete assignment, the
number of violated clauses is $|E| - S$, where $S$ is the size of the cut
associated to the assignment.
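
The relation between violated clauses and cut size can be checked on a small
graph. The following Python sketch is an illustration only, under assumptions
made here: vertices are positive integers, literals are signed integers, and
the function names are hypothetical. It builds the two clauses per edge and
verifies, for every assignment, that the number of violated clauses equals
$|E| - S$.

```python
from itertools import product


def max_cut_clauses(edges):
    """Two unit-weight clauses per edge (i, j): (x_i or x_j), (not x_i or not x_j)."""
    clauses = []
    for i, j in edges:
        clauses.append((i, j))    # violated when both endpoints are outside U
        clauses.append((-i, -j))  # violated when both endpoints are inside U
    return clauses


def violated(clauses, in_u):
    """Number of clauses falsified; in_u[v] tells whether vertex v belongs to U."""
    return sum(
        1 for clause in clauses
        if not any(in_u[abs(lit)] == (lit > 0) for lit in clause)
    )


def cut_size(edges, in_u):
    """Number of edges with one endpoint in U and the other in V - U."""
    return sum(1 for i, j in edges if in_u[i] != in_u[j])


def max_cut_by_enumeration(vertices, edges):
    """Exhaustive search that also checks violated == |E| - S for every assignment."""
    clauses = max_cut_clauses(edges)
    best = 0
    for bits in product([False, True], repeat=len(vertices)):
        in_u = dict(zip(vertices, bits))
        assert violated(clauses, in_u) == len(edges) - cut_size(edges, in_u)
        best = max(best, cut_size(edges, in_u))
    return best


# A triangle 1-2-3 with a pendant vertex 4 attached to vertex 3.
example_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
best_cut = max_cut_by_enumeration([1, 2, 3, 4], example_edges)
```

For each edge at most one of its two clauses can be violated, which happens
exactly when both endpoints fall on the same side of the cut. Minimizing the
number of violated clauses is therefore the same as maximizing $S$.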



Problem    n. inst.  MaxDPLL     MaxSolver  Lazy       Minisat    Pueblo
aim50*     16        16(0.59)    16(0.12)   16(28.25)  16(0.01)   16(0.00)
aim100*    16        16(2.67)    16(4.92)              16(0.02)   16(0.00)
aim200*    16        9           4                     16(0.03)   16(0.00)
jnh*       16        16(1.49)               6          16(0.08)   16(0.10)
ii8*       14        5                      1          10         3
ii32*      17        11                                16         15
par8*      10        10(0.92)               5          10(16.39)  10(26.52)
par16*c*   5         5(784.14)                         5(0.93)    5(0.93)

Fig. 15. Results for the Max-one problem on selected DIMACS SAT instances.

Fig. 16. Random Max-cut instances.



Note that this encoding produces an unweighted Max-2-SAT formula, so the LB4a
solver can be used. Random Max-cut instances are extracted from random graphs.
We considered graphs of 60 nodes with varying number of edges.

Figure 16 reports the results on this benchmark. It can be observed that for all
solvers other than Max-DPLL, problems become harder as the number of edges
increases. However, Max-DPLL solves instances of up to 500 edges almost
instantly. The second best solver is LB4a, but Max-DPLL is up to 15 times faster.
Pueblo and Minisat perform so poorly even in the easiest instances that they are
not included in the comparison.

7.2.4 Max-clique



The Max-clique problem is the problem of finding the maximum size clique
(complete subgraph) embedded in a given graph and its Max-SAT encoding was
described in the previous subsection. Solvers UP, MaxSolver and LB4a could not
be executed in this domain due to their limitations. Our first Max-clique
experiment used random graphs with 150 nodes and varying number of edges.
Figure 17 reports the results. Again, Max-DPLL is clearly better than any other
competitor. All other competitors are more than 2 orders of magnitude slower
than Max-DPLL.

We also considered the 66 Max-Clique instances from the DIMACS challenge.*
MaxSolver could not be executed in this benchmark because the number of
variables and clauses of the instances exceeds its capacity. Thus, the only two
Max-SAT solvers that could be executed are Max-DPLL and Lazy. Within the time
limit, they solved 32 and 23 instances, respectively. Minisat and Pueblo could
solve 22 and 16 instances, respectively. Therefore, Max-DPLL provided the best
performance in this benchmark, too.

These instances have been previously used to evaluate several dedicated max
clique algorithms. Performing a proper comparison with Max-DPLL is difficult
because their code is not available and we would need to re-program their
algorithms. However, following the approach of [44], we overcome this problem by
normalizing the reported times. Of course, this is a very simplistic approach
which disregards very relevant parameters such as the amount of memory or the
processor model. Consequently, the following results can only be taken as
indicative. Given a time limit of 2.5 hours per instance on our 3.2 GHz
computer, Max-DPLL was able to solve 37 instances. In an equivalent (via
normalization) time, [45] solves 38, [46] solves 36, [47] solves 45, and [44]
solves 52.

7.2.5 Combinatorial Auctions 

Combinatorial auctions allow bidders to bid for indivisible subsets of goods.
Consider a set $G$ of goods and $n$ bids. Bid $i$ is defined by the subset of
requested goods $G_i \subseteq G$ and the amount of money offered. The bid-taker,
who wants to maximize its revenue, must decide which bids are to be accepted.
Note that if two bids request the same good, they cannot be jointly accepted [7].
In its Max-SAT encoding, there is one variable $x_i$ associated to each bid.
There are unit clauses $(x_i, u_i)$ indicating that if bid $i$ is not accepted
there is a loss of profit $u_i$. Besides, for each pair $i, j$ of conflicting
bids, we add a mandatory clause $(\bar{x}_i \vee \bar{x}_j, \top)$.
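
As an illustration of this encoding, the following Python sketch builds the
weighted clauses for a toy auction and recovers the best revenue by exhaustive
enumeration. It is a sketch under assumptions made here, not the generator or
solver used in the experiments: bids are given as (set of goods, offer) pairs,
`float("inf")` plays the role of the mandatory weight, and all function names
are hypothetical.

```python
from itertools import combinations, product

TOP = float("inf")  # assumed representation of the mandatory weight


def auction_encoding(bids):
    """bids: list of (goods, offer) pairs; variable i + 1 stands for bid i."""
    # Unit clause (x_i, u_i): rejecting bid i loses its offer u_i.
    clauses = [((i + 1,), offer) for i, (_, offer) in enumerate(bids)]
    # Mandatory clause (not x_i or not x_j) for every pair sharing a good.
    for (i, (goods_i, _)), (j, (goods_j, _)) in combinations(enumerate(bids), 2):
        if goods_i & goods_j:
            clauses.append(((-(i + 1), -(j + 1)), TOP))
    return clauses


def cost(clauses, assignment):
    """Sum of the weights of the clauses falsified by `assignment` (var -> bool)."""
    return sum(
        weight
        for clause, weight in clauses
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause)
    )


def best_revenue(bids):
    """Total offered money minus the minimum loss over all assignments."""
    clauses = auction_encoding(bids)
    n = len(bids)
    min_cost = min(
        cost(clauses, dict(zip(range(1, n + 1), bits)))
        for bits in product([False, True], repeat=n)
    )
    return sum(offer for _, offer in bids) - min_cost


example_bids = [({"a", "b"}, 5), ({"b", "c"}, 4), ({"c"}, 3), ({"a"}, 2)]
revenue = best_revenue(example_bids)
```

In the toy instance the first and third bids are compatible and together offer
8, which no other compatible set of bids exceeds. The minimum cost of the
Max-SAT formula is the total of the rejected offers, 14 - 8 = 6.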

We used the CATS generator [48], which generates random instances inspired by
real-world scenarios. In particular, we generated instances from the Regions,
Paths and Scheduling distributions. The number of goods was fixed to 60 and we
increased the number of bids. By increasing the number of bids, instances become
more constrained (namely, there are more conflicting pairs of bids) and harder to
solve. UP, MaxSolver and LB4a could not be executed due to their limitations.
The Lazy solver could not be included in the Regions comparison due to overflow
problems.

* ftp://dimacs.rutgers.edu/pub/challenge/graph/benchmarks/clique

Fig. 17. Random Max-clique instances (MAXCLIQUE, 150 variables).

Figure 18 (top-left) presents the results for the Paths distribution. Max-DPLL
produces the best results, being 22 times faster than the second best option,
Lazy. Figure 18 (top-right) presents the results for the Regions distribution.
Max-DPLL is again the best algorithm. It is 26 times faster than the second best
solver, Pueblo. Finally, results for the Scheduling distribution are shown in
Figure 18 (bottom). In this benchmark, the performance of Max-DPLL and Minisat
is quite similar, while the other solvers are up to 4 times slower.



8 Conclusions and Future work 



This paper introduces a novel Max-SAT framework which highlights the relation- 
ship between SAT and Max-SAT solving techniques. Most remarkably, it extends 
the concept of resolution. Our resolution rule, first proposed in [1], has been proved 
complete in [35]. There are many beneficial consequences of this approach: 

• It allows us to talk about Max-SAT solving with the usual SAT terminology.

• It allows us to naturally extend basic algorithms such as DPLL and DP.

• It allows us to express several solving techniques that are spread around the
Max-SAT literature with a common formalism, to see their logical interpretation
and to see the connection with similar SAT, CSP and WCSP techniques.

Fig. 18. Combinatorial auctions. Top-left: Paths distribution. Top-right: Regions
distribution. Bottom: Scheduling distribution.

From a practical point of view, we have proposed a hybrid algorithm that combines
search and selected forms of inference. It follows a typical search strategy but, at
each visited node, it attempts to simplify the current subproblem using special cases
of resolution with which the problem is transformed into a simpler, equivalent one.
Our experiments on a variety of domains show that our algorithm is usually orders
of magnitude faster than its competitors.

Our current solver lacks features that are considered very relevant in the SAT
context (for example, clause learning and restarts). Since our framework makes
the connection between SAT and Max-SAT very obvious, they should be easily
incorporated in the future. Additionally, some of the ideas presented in this
paper have been borrowed from the weighted CSP field [17]. Therefore, it also
seems possible to incorporate new (weighted) constraint processing techniques.
Finally, we want to note the recent work of [41] in which very good lower bounds
are obtained by temporarily setting $\top = 1$ and simulating unit propagation.
Since the hyper-resolution rules presented in Section 5.2 are special cases of
their more general algorithm, we want to explore if their approach can be fully
described with our resolution rule.



A Correctness and Complexity of Max-VarElim 



In this appendix we prove Lemmas 19 and 21, which establish the correctness of
the Max-VarElim function in Figure 5 and its time and space complexity. In the
proofs we borrow some ideas from [25,28,35] and adapt them to our framework.

In the following, when we write $C \in \mathcal{F}$ we mean
$(C, u) \in \mathcal{F}$ for some weight $u$ (there is no ambiguity because all
clauses in $\mathcal{F}$ are different). We use the symbol
$\mathcal{F} \vdash_{x_i} \mathcal{F}'$ to denote the application of a resolution
step to formula $\mathcal{F}$ resulting in formula $\mathcal{F}'$, where the
clashing variable was $x_i$. Consider the elimination of variable $x_i$ with
function Max-VarElim. First of all, the formula is partitioned into two sets of
clauses, $\mathcal{B}$ and $\mathcal{F}$. Then, clauses of the form
$(x_i \vee A, u)$ are fetched from $\mathcal{B}$, resolved with clashing clauses
until quiescence or disappearance and, finally, are discarded. Suppose that
discarded clauses are stored in a set $\mathcal{D}$. Formally, we can see the
execution of Max-VarElim as a sequence of resolution steps,

$$\mathcal{B}_0 \cup \mathcal{F}_0 \cup \mathcal{D}_0 \vdash_{x_i} \mathcal{B}_1 \cup \mathcal{F}_1 \cup \mathcal{D}_1 \vdash_{x_i} \cdots \vdash_{x_i} \mathcal{B}_q \cup \mathcal{F}_q \cup \mathcal{D}_q$$

where $\mathcal{D}_0 = \emptyset$. For all $0 \le k \le q$: $\mathcal{B}_k$ is a
set of clauses that contain either $x_i$ or $\bar{x}_i$, $\mathcal{F}_k$ is a set
of clauses that contain neither $x_i$ nor $\bar{x}_i$, and $\mathcal{D}_k$ is a
set of clauses that contain $x_i$. Besides, $\mathcal{B}_q$ does not have any
clause with $x_i$. The output of Max-VarElim is $\mathcal{F}_q$ that, as we will
prove, is essentially equivalent to the original formula. Let $N_i$ denote the
set of variables sharing clauses with $x_i$ in the starting $\mathcal{B}$
(namely, $\mathcal{B}_0$),

$$N_i = \{x_j \neq x_i \mid x_j \in var(C),\ C \in \mathcal{B}_0\}$$

and let $n_i = |N_i|$ be its cardinality. In the remainder of this appendix we
will show that: the number of new clauses generated during the sequence of
resolution steps is bounded by $O(3^{n_i})$ (space complexity), the number of
resolution steps is bounded by $O(9^{n_i})$ (time complexity) and, from an
optimal model of $\mathcal{F}_q$ we can trivially generate an optimal model of
the original formula $\mathcal{B}_0 \cup \mathcal{F}_0$ (correctness).

Observe that all the variables different from $x_i$ appearing in clauses
generated by the resolution process must belong to $N_i$, because resolution
does not add new variables. Therefore, all the clauses in $\mathcal{B}_k$ have
the form $l \vee A$ where $var(l) = x_i$ and $var(A) \subseteq N_i$. Variable
$x_i$ must appear in the clause either as a positive or negative literal
(namely, there are 2 options) and every $x_j \in N_i$ may or may not appear in
$A$ and, if it appears, it can be in positive or negative form (namely, there are
3 options). Consequently, the size of $\mathcal{B}_k$ is bounded by
$2 \times 3^{n_i}$. For similar reasons, every clause $C$ added to $\mathcal{F}$
during the resolution process satisfies that $var(C) \subseteq N_i$. Every
$x_j \in N_i$ may or may not appear in $C$ and, if it appears, it may be positive
or negative (namely, there are 3 options). Consequently, the number of
non-original clauses in $\mathcal{F}_k$ is bounded by $3^{n_i}$. Therefore, the
number of clauses added to $\mathcal{B}$ and $\mathcal{F}$ during the execution
of Max-VarElim is bounded by $2 \times 3^{n_i} + 3^{n_i}$. As a result, its space
complexity is $O(3^{n_i})$.
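
The counting argument can be confirmed by enumeration for small values of
$n_i$. The Python sketch below is only an illustration with hypothetical names:
variable 1 plays the role of $x_i$, variables 2 to n + 1 play the role of
$N_i$, and a clause is a frozenset of signed integers.

```python
from itertools import product


def neighbour_clauses(n):
    """All non-tautological clauses over n neighbour variables (numbered 2..n+1)."""
    neighbours = range(2, n + 2)
    clauses = set()
    # Each neighbour is absent (0), positive (+1) or negative (-1): 3 options.
    for states in product((0, 1, -1), repeat=n):
        clauses.add(frozenset(s * v for s, v in zip(states, neighbours) if s != 0))
    return clauses


def bucket_clauses(n):
    """All clauses l or A with var(l) = x_i (variable 1) and var(A) within N_i."""
    clauses = set()
    for xi_literal in (1, -1):  # x_i positive or negative: 2 options
        for a in neighbour_clauses(n):
            clauses.add(a | {xi_literal})
    return clauses


counts = [(len(bucket_clauses(n)), len(neighbour_clauses(n))) for n in range(5)]
```

For every n tried, the first count equals 2 x 3^n and the second equals 3^n,
matching the two bounds used in the argument above.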

Next, we analyze the time complexity. Recall that two clauses
$(x_i \vee A, u), (\bar{x}_i \vee B, w) \in \mathcal{F}$ clash if $A \vee B$ is
not a tautology (i.e., $\forall l \in A,\ \bar{l} \notin B$) and $A \vee B$ is
not absorbed (i.e., $\forall (C, \top) \in \mathcal{F},\ C \not\subseteq A \vee B$).
We say that a clause $(x_i \vee A, u)$ is saturated if there is no clause in
$\mathcal{F}$ clashing with it. The following lemma shows that resolving on a
clause either removes the clause from the formula or reduces the number of
clauses clashing with it.

Lemma 33 Consider a resolution step $\mathcal{F} \vdash_{x_i} \mathcal{F}'$ where
$(x_i \vee A, u)$ and $(\bar{x}_i \vee B, w)$ are the clashing clauses. Then,
either $x_i \vee A \notin \mathcal{F}'$ or the number of clauses clashing with
$x_i \vee A$ decreases.

PROOF. We reason by cases:

(1) If $u < w$ or $u = w < \top$ then the posterior $x_i \vee A$ has weight 0
(namely, it disappears from the formula).

(2) If $u = w = \top$ then the effect of resolution is to add the resolvent to
the formula ($\mathcal{F}' = \mathcal{F} \cup \{(A \vee B, \top)\}$). Then,
$\bar{x}_i \vee B$ does not clash with $x_i \vee A$ anymore.

(3) If $u > w$ then $\bar{x}_i \vee B$ is replaced by
$\bar{x}_i \vee B \vee \bar{A}$ in the formula. The new clause does not clash
with $x_i \vee A$, because $A \vee B \vee \bar{A}$ is a tautology.

Consider the inner loop of Max-VarElim. It selects a clause $x_i \vee A$ and
resolves it until either it disappears or it saturates. If $x_i \vee A$
saturates, it is removed from $\mathcal{B}$ and added to $\mathcal{D}$. We call
this sequence of resolution steps the processing of $x_i \vee A$ and use the
symbol $\vdash^{*}_{x_i \vee A}$ to represent it. A consequence of the previous
lemma is that the number of resolution steps required to process $x_i \vee A$ is
bounded by the number of clauses clashing with it. Note that the number of
clauses clashing with $(x_i \vee A, u)$ is bounded by $3^{n_i}$, because clashing
clauses must belong to $\mathcal{B}$ and variable $x_i$ must occur negated.
Therefore, for each iteration of the outer loop, the inner loop of Max-VarElim
iterates at most $3^{n_i}$ times.

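Consider now the outer loop of Max-VarElim. It selects a sequence of clauses
$x_i \vee A_1, x_i \vee A_2, \ldots, x_i \vee A_s$ and processes them one after
another. We can see this process as,

$$\mathcal{B}_{k_0} \cup \mathcal{F}_{k_0} \cup \mathcal{D}_{k_0} \vdash^{*}_{x_i \vee A_1} \mathcal{B}_{k_1} \cup \mathcal{F}_{k_1} \cup \mathcal{D}_{k_1} \vdash^{*}_{x_i \vee A_2} \cdots \vdash^{*}_{x_i \vee A_s} \mathcal{B}_{k_s} \cup \mathcal{F}_{k_s} \cup \mathcal{D}_{k_s}$$

Recall that the algorithm always selects for processing a clause $x_i \vee A_j$
of minimal size (line 4). Observe that the size of the compensation clause
$x_i \vee A \vee \bar{B}$ added to $\mathcal{B}$ (line 9) is larger than the
clause that is being processed. As a consequence, once a clause is processed, it
does not appear again in $\mathcal{B}$, which means that
$\forall_{1 \le j < j' \le s}\ A_j \neq A_{j'}$. A direct consequence is that,
since there are at most $3^{n_i}$ distinct $A_j$, the outer loop iterates at most
$3^{n_i}$ times. Therefore, the maximum number of iterations of the inner loop is
$3^{n_i} \times 3^{n_i} = 9^{n_i}$, which means that the time complexity of the
function is $O(9^{n_i})$.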

Finally, we prove the correctness of Max-VarElim. 

Lemma 34 A saturated clause remains saturated during any sequence of resolution
steps $\vdash_{x_i}$.

PROOF. Consider a resolution step $\mathcal{F} \vdash_{x_i} \mathcal{F}'$. Let
$x_i \vee A$ and $\bar{x}_i \vee B$ be the clashing clauses, and let $x_i \vee C$
be a saturated clause of $\mathcal{F}$. We only need to prove that $x_i \vee C$
remains saturated in $\mathcal{F}'$. Since $x_i \vee C$ is saturated in
$\mathcal{F}$, either $C \vee B$ is a tautology or it is absorbed in
$\mathcal{F}$. The only new clause in $\mathcal{F}'$ that could clash with
$x_i \vee C$ is $\bar{x}_i \vee B \vee \bar{A}$. However, if $C \vee B$ was a
tautology, so is $C \vee B \vee \bar{A}$. If $C \vee B$ was absorbed in
$\mathcal{F}$, so will be $C \vee B \vee \bar{A}$ in $\mathcal{F}'$.

A consequence of the previous lemma is that at the end of the sequence of
resolution steps performed by Max-VarElim we have a formula
$\mathcal{B}_{k_s} \cup \mathcal{F}_{k_s} \cup \mathcal{D}_{k_s}$ such that all
its clauses are saturated.

To prove the correctness of Max-VarElim we only need to prove that any
assignment $I$ of $\mathcal{F}_{k_s}$ can be extended to variable $x_i$ in a
cost-free manner, taking into account the clauses
$\bar{x}_i \vee B \in \mathcal{B}_{k_s}$ and the clauses
$x_i \vee A \in \mathcal{D}_{k_s}$, because it means that finding the optimal
assignment of $\mathcal{F}_{k_s}$ is equivalent to finding the optimal assignment
of $\mathcal{B}_{k_s} \cup \mathcal{F}_{k_s} \cup \mathcal{D}_{k_s}$ which, in
turn, is equivalent to finding the optimal assignment of the original formula.

If $\mathcal{B}_{k_s} = \emptyset$ (resp. $\mathcal{D}_{k_s} = \emptyset$),
variable $x_i$ must be set to true (resp. false). Else, consider that there is a
clause $x_i \vee A \in \mathcal{D}_{k_s}$ such that $I$ does not satisfy $A$
(similarly for $\bar{x}_i \vee B \in \mathcal{B}_{k_s}$). Variable $x_i$ must be
set to true. We show that $I$ satisfies every
$\bar{x}_i \vee B \in \mathcal{B}_{k_s}$: clause $x_i \vee A$ is saturated, so
either $A \vee B$ is a tautology or there is a clause $C \in \mathcal{F}_{k_s}$
with $C \subseteq A \vee B$. In the first case, since $I$ does not satisfy $A$,
and since $A \vee B$ is a tautology, this means that $I$ satisfies $B$. In the
second case, since $I$ satisfies $C$ and does not satisfy $A$, it must satisfy
$B$.



B Solving Max-SAT with Pseudo-boolean and SAT solvers 

In Linear pseudo-Boolean (LPB) problems over boolean variables
$\{x_1, \ldots, x_n\}$, values true and false are replaced by numbers 1 and 0,
respectively. Literal $l_i$ represents either $x_i$ or its negation $1 - x_i$.
An LPB problem is defined by an LPB objective function (to be minimized),

$$\sum_{i=1}^{n} a_i l_i \quad \text{where } a_i \in \mathbb{Z}$$

and a set of LPB constraints,

$$\sum_{i=1}^{n} a_{ij} l_i \ge b_j \quad \text{where } a_{ij}, b_j \in \mathbb{Z},\ x_i \in \{0, 1\}$$

A Max-SAT formula can be encoded as an LPB problem [11] by partitioning the set
of clauses into three sets: $\mathcal{H}$ contains the mandatory clauses
$(C, \top)$, $\mathcal{W}$ contains the non-unary non-mandatory clauses
$(C, u < \top)$ and $\mathcal{U}$ contains the unary non-mandatory clauses
$(l, u)$. For each hard clause $(C_j, \top) \in \mathcal{H}$ there is an LPB
constraint $C'_j \ge 1$, where $C'_j$ is obtained from $C_j$ by replacing $\vee$
by $+$ and negated variables $\bar{x}$ by $1 - x$. For each non-unary weighted
clause $(C_j, u_j) \in \mathcal{W}$ there is an LPB constraint
$C'_j + r_j \ge 1$, where $C'_j$ is computed as before, and $r_j$ is a new
variable that, when set to 1, trivially satisfies the constraint. Finally, the
objective function is,

$$\sum_{(C_j, u_j) \in \mathcal{W}} u_j r_j + \sum_{(l_j, u_j) \in \mathcal{U}} u_j (1 - l_j)$$
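
The translation can be sketched in Python as follows. This is an illustration
under assumptions made here, not the translator used in the experiments: a
linear expression is stored as a pair (coefficient dictionary, constant), every
constraint is read as "expression >= 1", `float("inf")` plays the role of the
mandatory weight, and the names `clause_to_linear`, `maxsat_to_lpb`, `x<k>` and
`r<j>` are hypothetical. The cost of a violated unary clause is written as
$u_j (1 - l_j)$.

```python
TOP = float("inf")  # assumed representation of the mandatory weight


def clause_to_linear(clause):
    """Replace 'or' by '+' and a negated variable x by 1 - x."""
    coeffs, constant = {}, 0
    for lit in clause:
        name = f"x{abs(lit)}"
        if lit > 0:
            coeffs[name] = coeffs.get(name, 0) + 1
        else:
            coeffs[name] = coeffs.get(name, 0) - 1
            constant += 1
    return coeffs, constant


def maxsat_to_lpb(weighted_clauses):
    """Return (constraints, objective); each constraint means expression >= 1."""
    constraints = []
    obj_coeffs, obj_constant = {}, 0
    for j, (clause, weight) in enumerate(weighted_clauses):
        coeffs, constant = clause_to_linear(clause)
        if weight == TOP:
            # Hard clause: C'_j >= 1.
            constraints.append((coeffs, constant))
        elif len(clause) > 1:
            # Non-unary weighted clause: C'_j + r_j >= 1, paying u_j when r_j = 1.
            r = f"r{j}"
            coeffs[r] = 1
            constraints.append((coeffs, constant))
            obj_coeffs[r] = obj_coeffs.get(r, 0) + weight
        else:
            # Unary weighted clause (l, u): add u * (1 - l) to the objective.
            for name, c in coeffs.items():
                obj_coeffs[name] = obj_coeffs.get(name, 0) - weight * c
            obj_constant += weight * (1 - constant)
    return constraints, (obj_coeffs, obj_constant)


example = [((1, 2), TOP), ((-1, -2), 3), ((-1,), 2), ((-2, 3), 4), ((-3,), 1)]
lpb_constraints, lpb_objective = maxsat_to_lpb(example)
```

For instance, the hard clause (x1 or x2) becomes x1 + x2 >= 1, and the weighted
clause (not x1 or not x2, 3) becomes (1 - x1) + (1 - x2) + r1 >= 1 with the term
3 r1 in the objective.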

An LPB problem can be solved with a native LPB solver such as Pueblo or with a
SAT solver. In the latter case, each LPB constraint must be converted into a logic
circuit. There are different possible conversions such as BDDs, adders or sorters. In
our experiments we used Minisat+ [49], a translating tool that converts each PB
constraint into the presumably more convenient circuit and solves the
corresponding SAT formula with Minisat. Minisat+ converts the objective function of
the LPB problem into another LPB constraint by setting an upper bound. The LPB
problem is solved by decreasing the value of the upper bound until an infeasible
SAT formula is found.
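
The bound-decreasing loop can be sketched generically in Python. This is not
the actual tool's code or interface. It assumes integer objective values and a
hypothetical oracle `solve_with_bound(bound)` that returns the objective value
of some solution whose value is at most `bound`, or `None` when the bounded
formula is infeasible. Tightening the bound to one less than the value just
found is also an assumption of this sketch.

```python
def minimise_by_decreasing_bound(solve_with_bound, initial_bound):
    """Lower the upper bound until the oracle reports infeasibility."""
    best = None
    bound = initial_bound
    while True:
        value = solve_with_bound(bound)
        if value is None:
            # No solution within the bound: the last value found is optimal.
            return best
        best = value
        bound = value - 1  # demand a strictly better solution next time


def toy_oracle(bound, feasible_values=(9, 7, 4)):
    """Stand-in for a SAT call: returns the worst feasible value within the bound."""
    candidates = [v for v in feasible_values if v <= bound]
    return max(candidates) if candidates else None


result = minimise_by_decreasing_bound(toy_oracle, 10)
```

With the toy oracle the loop finds the values 9, 7 and 4 in turn, and the call
with bound 3 is infeasible, so 4 is returned as the optimum. Returning `None`
signals that even the initial bound admits no solution.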
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